Int Instructions 2016: Fill & Download for Free

GET FORM

Download the form

How to Edit Your Int Instructions 2016 Online, Hassle-Free

Follow these steps to get your Int Instructions 2016 edited with ease:

  • Select the Get Form button on this page.
  • You will be taken to our PDF editor.
  • Edit your file with our easy-to-use features, such as adding checkmarks, erasing, and other tools in the top toolbar.
  • Hit the Download button to download your completed document for future reference.

We Are Proud to Let You Edit Int Instructions 2016 With the Best Experience

Try Our Best PDF Editor for Int Instructions 2016


How to Edit Your Int Instructions 2016 Online

When you edit your document, you may need to add text, fill in the date, and make other changes. CocoDoc makes it very easy to edit your form right in your browser. Let's see how this works.

  • Select the Get Form button on this page.
  • You will be taken to our online PDF editor web app.
  • Once you are in the editor, click the tool icons in the top toolbar to edit your form, for example by adding checkmarks or highlights.
  • To add a date, click the Date icon, then hold and drag the generated date to the field you need to fill in.
  • Change the default date by deleting it and entering the desired date in the box.
  • Click OK to confirm the added date, then click the Download button when you are done.

How to Edit Text for Your Int Instructions 2016 with Adobe DC on Windows

Adobe DC on Windows is a popular tool for editing files on a PC. It is especially useful if you prefer to work on files in your local environment. So, let's get started.

  • Find and open the Adobe DC app on Windows.
  • Find and click the Edit PDF tool.
  • Click the Select a File button and upload a file for editing.
  • Click a text box to adjust the text font, size, and other formatting.
  • Select File > Save or File > Save As to save your changes to Int Instructions 2016.

How to Edit Your Int Instructions 2016 With Adobe DC on Mac

  • Find the file you intend to edit and open it with Adobe DC for Mac.
  • Navigate to and click Edit PDF on the right.
  • Edit your form as needed by selecting the tools from the top toolbar.
  • Click the Fill & Sign tool and select the Sign icon in the top toolbar to add your own signature.
  • Select File > Save to save all your edits.

How to Edit your Int Instructions 2016 from G Suite with CocoDoc

Do you use G Suite for your work and need to sign a form? You can make changes to your form in Google Drive with CocoDoc, so you can fill out your PDF right in your favorite workspace.

  • Add the CocoDoc for Google Drive add-on.
  • In Drive, browse to the form to be filled, right-click it, and select Open With.
  • Select the CocoDoc PDF option, and allow your Google account to integrate with CocoDoc in the popup window.
  • Choose the PDF Editor option to begin filling out your form.
  • Click the tools in the top toolbar to edit your Int Instructions 2016 in the fields to be filled, for example by signing or adding text.
  • Click the Download button to keep a copy in case you lose your changes.

PDF Editor FAQ

Why is n++ faster than n=n+1?

Is it? Have you looked at the assembly to see if there's any difference at all?

Here's a bit of C code.

int binary(int i){
	i = i + 1;
	return i;
}

int unary(int i){
	i++;
	return i;
}

Here's the compiler I'm using.

rountree4@cuyahoga ~$ clang --version
Apple LLVM version 7.0.0 (clang-700.1.76)
Target: x86_64-apple-darwin15.6.0
Thread model: posix

Here's how I generate the assembly without compiling it.

rountree4@cuyahoga ~$ clang -S -O3 foo.c

Here's the assembly for binary:

_binary:                                ## @binary
	.cfi_startproc
## BB#0:
	pushq	%rbp
Ltmp0:
	.cfi_def_cfa_offset 16
Ltmp1:
	.cfi_offset %rbp, -16
	movq	%rsp, %rbp
Ltmp2:
	.cfi_def_cfa_register %rbp
	## kill: EDI<def> EDI<kill> RDI<def>
	leal	1(%rdi), %eax
	popq	%rbp
	retq
	.cfi_endproc

And here's the assembly code for unary.

_unary:                                 ## @unary
	.cfi_startproc
## BB#0:
	pushq	%rbp
Ltmp3:
	.cfi_def_cfa_offset 16
Ltmp4:
	.cfi_offset %rbp, -16
	movq	%rsp, %rbp
Ltmp5:
	.cfi_def_cfa_register %rbp
	## kill: EDI<def> EDI<kill> RDI<def>
	leal	1(%rdi), %eax
	popq	%rbp
	retq
	.cfi_endproc

The generated code is identical, and thus the performance will be identical (all other parameters being equal).

Nicolas Ortiz Sommerfeld comments:

"That's NOT true!!! The performance WILL not be the same. Most processors have different instructions for adding and *incrementing*. Incrementing increases always by the same amount (1) so it should be technically faster because you do not need to load the immediate used for the second argument because you already know it's 1. In conclusion, n++ is always a tiny bit faster!!!! In your case, your compiler doesn't care and just outputs both options as just adding by 1 instead of incrementing which is pretty dumb."

I will concede that an instruction sequence that requires a separate instruction to load an immediate value might be slower than one that does not require a separate load. The lea instruction above does not require a separate load for the immediate value. The immediate value is encoded into the instruction.

Telling me that I have a performance regression is no big deal. Telling the LLVM folks that they have a performance regression is a much larger deal. Telling them they have a performance regression in one of the most common code paths is something I would take on only with a great deal of caution. Telling them they screwed up their performance and not providing any data to demonstrate that is a good way to not be taken seriously.

So how might you go about filing a performance bug with LLVM? First, let's see if Intel does things differently.

rountree@quartz386 ~/Quora$ icc --version
icc_orig (ICC) 16.0.3 20160415
Copyright (C) 1985-2016 Intel Corporation. All rights reserved.
Here’s the assembly for unary:# -- Begin unary  .text # mark_begin;  .align 16,0x90  .globl unary # --- unary(int) unary: # parameter 1: %edi ..B2.1: # Preds ..B2.0  .cfi_startproc ..___tag_value_unary.4: ..L5:  #6.17  incl %edi #7.2  movl %edi, %eax #8.9  ret #8.9  .align 16,0x90  .cfi_endproc  # LOE # mark_end;  .type unary,@function  .size unary,.-unary  .data # -- End unary And here’s the assembly for binary, except this time I’m using n=n+7.# -- Begin binary  .text # mark_begin;  .align 16,0x90  .globl binary # --- binary(int) binary: # parameter 1: %edi ..B1.1: # Preds ..B1.0  .cfi_startproc ..___tag_value_binary.1: ..L2:  #1.18  addl $7, %edi #2.10  movl %edi, %eax #3.9  ret #3.9  .align 16,0x90  .cfi_endproc  # LOE # mark_end;  .type binary,@function  .size binary,.-binary  .data # -- End binary Perhaps surprisingly, using the addl or incl instructions requires an additional movl to get the result into eax. That move is dependent on the result of the previous arithmetic instruction; the two cannot execute in parallel.So which is faster?Here’s the test rig.#include <stdio.h> #include <sys/time.h> #include <stdint.h>  int binary(int i){  i = i + 7;  return i; }  int unary(int i){  i++;  return i; }  int a; volatile uint64_t i, j;  int main(){  struct timeval start, end;    gettimeofday(&start, NULL);  for(i = 0; i<100; i++){  for (j=0; j<1000000; j++){  a = binary(a);  }  }  gettimeofday(&end, NULL);   fprintf(stdout, "Binary time in seconds: %lf\n",   (end.tv_sec - start.tv_sec) +   (end.tv_usec - start.tv_usec)/1000000.0);   gettimeofday(&start, NULL);  for(i = 0; i<100; i++){  for (j=0; j<1000000; j++){  a = unary(a);  }  }  gettimeofday(&end, NULL);   fprintf(stdout, "Unary time in seconds: %lf\n",   (end.tv_sec - start.tv_sec) +   (end.tv_usec - start.tv_usec)/1000000.0);   gettimeofday(&start, NULL);  for(i = 0; i<100; i++){  for (j=0; j<1000000; j++){  }  }  gettimeofday(&end, NULL);   fprintf(stdout, "Loop time in seconds: %lf\n",   (end.tv_sec - start.tv_sec) +   (end.tv_usec - start.tv_usec)/1000000.0);   return !a; } Here are the results for the Intel icc compiler.rountree@quartz386 ~/Quora$ ./icc.out  Binary time in seconds: 0.270814 Unary time in seconds: 0.270599 Loop time in seconds: 0.231791 And here are the results from gcc 4.9.3, as I don’t have clang and icc on the same machine, but gcc also uses identical lea-based code for both functions.rountree@quartz386 ~/Quora$ ./gcc.out  Binary time in seconds: 0.232028 Unary time in seconds: 0.231783 Loop time in seconds: 0.231703 The loop time is close enough to identical. In loop+function time, though, the lea code is significantly faster. How much so?Function time (seconds)  binary unary gcc 0.000245 0.000080 intel 0.039023 0.038808 In this particular instance, claiming inc was faster than lea was wrong by two orders of magnitude.I’m looking forward to giving Intel a hard time about this when I’m visiting them in Hillsboro next week.

How fast can a supercomputer loop 4228250625 (255 ^ 4) times?

I wrote the following program:

int main(int argc, char *argv[]) {
	long i;
	for (i = 0; i < 4228250625L; i += 1);
	return 0;
}

which compiles to:

main:
.LFB0:
	.cfi_startproc
	movl	$4228250625, %eax
.L2:
	subq	$1, %rax
	jne	.L2
	movl	$0, %eax
	ret

When I run it on my 2016 Macbook Pro it takes 5.3 seconds.

$ time ./a.out
real	0m5.317s
user	0m5.315s
sys	0m0.000s

So that is about 800 million loops per second. Of course the loop is only two instructions, so yeah.

When I compile with optimization, the compiler figures out that the loop doesn't do anything and just leaves it out, so that is fast!

On a supercomputer, you would want to parallelize this a few thousand ways, so maybe a millisecond total?

UPDATE:

If you nest the loops, like this:

int main(int argc, char *argv[]) {
	for (unsigned i = 0; i < 255; i += 1)
		for (unsigned j = 0; j < 255; j += 1)
			for (unsigned k = 0; k < 255; k += 1)
				for (unsigned l = 0; l < 255; l += 1);
	return 0;
}

it is 5% slower:

$ time ./a.out
real	0m5.474s
user	0m5.468s
sys	0m0.005s

SECOND UPDATE:

I added code to enable OpenMP so I can use a 16-core 32-thread server I have access to:

#include "omp.h"

int main(int argc, char *argv[]) {
#pragma omp parallel for
	for (unsigned i = 0; i < 255; i += 1)
		for (unsigned j = 0; j < 255; j += 1)
			for (unsigned k = 0; k < 255; k += 1)
				for (unsigned l = 0; l < 255; l += 1);
	return 0;
}

Now I can compile with -fopenmp and run the program with

OMP_NUM_THREADS=N time ./a.out

for various values of N up to 32. For 1 thread on this server it takes 8.4 seconds (slower than my Macbook), but for 32 threads it runs in 0.44 seconds; 16 threads runs in 0.68 seconds. These last two data points are important because they illustrate that programs that do not do memory references don't benefit as much from hyperthreading. Still, I am now doing north of 8 billion loops per second.
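A side note from me rather than the original answer: if you want the optimizer to keep an empty counting loop around at -O2 or -O3, the usual trick is to give the loop a side effect the compiler is not allowed to remove, for example by declaring the counter volatile. A minimal sketch:

#include <stdio.h>

int main(void)
{
	/* "volatile" tells the compiler that every access to i is observable,
	   so the counting loop cannot be optimized away even at -O3. */
	volatile long i;
	for (i = 0; i < 4228250625L; i += 1)
		;
	printf("final count: %ld\n", (long)i);
	return 0;
}

Note that this also changes what is being measured: a volatile counter forces a load and a store on every iteration, so it will run noticeably slower than the two-instruction register loop shown above.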

How is source code translated into machine language?

Physical phenomena? Oh boy.

If you're asking for physical phenomena, we'll need to get down to a much lower level than such strange, abstract ideas of "source code" and "machine language". I'll start with the physical phenomena and work my way up to the level of compiling and machine code execution, but I hope you're ready for a bit of light reading along the way. Don't worry— I've written everything here with a layman reader in mind.

The Physical Level

In terms of physical phenomena, it all just boils down to voltages [1]. In the field of electronic circuits, there's a particularly handy device called a "transistor" that basically allows you to control voltages by means of other voltages— it's an electronic switch, essentially.

[Image: A diagram of a transistor that I stole from Wikipedia]

Transistors [2] have three "terminals" that you can attach wires to, labeled "G", "S", and "D". If you apply a high voltage to the "G" terminal, the "S" and "D" terminals are electrically connected together and allow electric current to flow through. But if the "G" terminal has a low voltage, the "S" and "D" terminals are electrically disconnected [3]. It should be simple to imagine that, using this "transistor" device, you can build a circuit that uses voltages to manipulate other voltages. You can do this by systematically selecting parts of the circuit to connect or disconnect from each other by using transistors, which causes changes in voltage that affect other transistors, and so on.

[1] I hope you paid attention in whatever introductory EM physics class you took, if any. But on the off chance that you didn't take such a class, or you fell asleep in it, a "voltage" is really just a particular way of measuring electrical potential energy that results from placing an electrical charge in a particular location in an electric field. A "high voltage" means that there is high electrical potential energy per unit of electric charge. If you still have no idea what I'm talking about, just think of "high voltage" as very roughly meaning "high energy"!

[2] Specifically in the example I give here, we're talking about a particular type of transistor called a "MOSFET".

[3] This is an enormous oversimplification of the physical equations that describe how MOSFETs actually work, but the "voltage switch" idea is a useful one for conceptualizing how we build circuits that take advantage of these things.

The Logical Level

Now for the next layer of abstraction. At the most basic level in a computer (or any other digital device, really), we can "pretend" that something carrying a high voltage should represent the idea of "truth", and something carrying a low voltage should represent the idea of "falsehood". We can use those transistor circuits we built in the prior "physical" level to create devices called "logic gates". These are just physical devices that represent the ideas of "logical AND", "logical OR", and "logical NOT"— equivalent to what you'd find in basic mathematical logic, except now in physical form (being made out of transistors) rather than just existing on paper in the form of a logic equation.

[Image: Diagrams (that I stole from Wikipedia) of some possible ways we can construct logical AND, OR, and NOT gates using transistors. The double blue lines represent transistors, and the squiggly lines represent resistors.]
So, for instance, in the NOT device, if we apply a high voltage ("true") to point a, that would cause point F to have a low voltage ("false"), and if we apply a low voltage ("false") to point a, that would cause point F to have a high voltage ("true"). Essentially, this is logical NOT in circuit form.

We can combine these AND gates, OR gates, and NOT gates in a huge number of ways to transform the "true" and "false" signals into new "true" and "false" signals. So with this idea of manipulating truth and falsehood using logic gates, we can build circuits that can calculate whether things are true or false, just by setting the inputs of the gates and seeing what comes out of the other side, and checking whether the output is a high voltage (true) or a low voltage (false).

From these simple truth signals represented by voltages, we can now construct a basic system that can represent arithmetic. In our high/low voltage setup, we'll use the base-2 (binary) number system rather than our usual base-10 (decimal) system, since we only really have two possible "digits" (the true/false signals) available for us to use [1]. We'll pretend the "true" signals are actually the digit "1", and likewise, we'll pretend that the "false" signals are actually the digit "0". By combining the aforementioned AND, OR, and NOT gates in creative ways, we can create new devices that can perform addition, subtraction, and so on just by looking at the "digits" and combining/transforming them through sets of logic gates to get new "digits" that represent a "sum", a "difference", or whatever other "arithmetic" operation we want. (In fact, all you really need at this level is addition, since subtraction is just adding a negative number [2], multiplication is repeated addition [3], and division is repeated subtraction... sort of [4]).

[1] Binary arithmetic is not much different from decimal arithmetic. In fact, it's usually a lot simpler. If you're not already familiar with binary arithmetic, here's an article describing how it works.

[2] It turns out that in order to efficiently represent negative binary numbers using hardware, you need to take advantage of some clever mathematical properties of addition and carrying.

[3] It's usually a lot more complicated than just saying that "multiplication is repeated addition", because if you want the multiplication to happen "fast", you don't want to just stack a large number of addition devices together and call it a day.

[4] Repeated subtraction with binary whole numbers kind of only works for division with remainders. As it turns out, representing fractional numbers using limited amounts of binary digits is a fairly hard problem.

Something to Remember Me By

We're not nearly up to the level of a CPU yet, though. Next, we need to introduce the idea of "state" and "memory". In other words, we need to build devices that can "remember" the results of prior truth/arithmetic calculations. It's not enough to just take the result of some calculation and see what it is right now, we must also be able to store that result so we can use it again at a later time.

We need a device that has "memory"— one whose "truth/falsehood" output value doesn't just depend on its inputs, but also on its entire history of inputs leading up to the present moment. This is the idea of a "latch" device. A "latch" is nothing more than a very particular arrangement of logic gates [1] that "remembers" its last input and continually outputs that value forever until you change it to something else.
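To make that feedback idea concrete, here is a tiny C sketch of my own (not from the original answer) that simulates an SR latch built from two cross-coupled NOR gates; the NOR helper and the variable names are just illustrative:

#include <stdio.h>

/* One-bit NOR "gate", modeled as a function on 0/1 values. */
static int NOR(int a, int b) { return !(a || b); }

/* An SR latch: two cross-coupled NOR gates. The outputs feed back into
   the inputs, which is exactly what gives the circuit its "memory".
   We simulate the feedback by re-evaluating until the outputs settle. */
static void sr_latch(int S, int R, int *Q, int *Qbar)
{
	for (int step = 0; step < 4; step++) {  /* settles within a few steps */
		int q    = NOR(R, *Qbar);
		int qbar = NOR(S, q);
		*Q = q;
		*Qbar = qbar;
	}
}

int main(void)
{
	int Q = 0, Qbar = 1;

	sr_latch(1, 0, &Q, &Qbar);  /* "Set": Q becomes 1 */
	printf("after set:   Q=%d\n", Q);

	sr_latch(0, 0, &Q, &Qbar);  /* inputs idle: Q *stays* 1 -- memory! */
	printf("after idle:  Q=%d\n", Q);

	sr_latch(0, 1, &Q, &Qbar);  /* "Reset": Q becomes 0 */
	printf("after reset: Q=%d\n", Q);

	sr_latch(0, 0, &Q, &Qbar);  /* inputs idle: Q stays 0 */
	printf("after idle:  Q=%d\n", Q);
	return 0;
}

Because the outputs feed back into the inputs, the pair of gates keeps holding whatever value was last "set" or "reset" into it, which is exactly the "memory" property described above.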
Using this property of latches, we can build circuits that not only do arithmetic/truth calculations, but can also "remember" those calculations and use the results for later calculations.

Now that we have introduced the dimension of time (since our calculations can now persist and change over time via memory devices), we'll need some way to synchronize all these calculations, so we can be sure that all the timed memory-based calculations give us the correct answer at the correct time. The way digital electronics do this is through a global voltage signal called a "clock". The clock signal [2] is just a voltage that oscillates between "high" and "low" voltages really fast, but at a very regular known interval. We can design our logic circuits to synchronize along this global clock signal and give us the "next stage" in our multi-step calculations each time the clock transitions from "low" to "high".

Now we can build more advanced "latch" devices that synchronize along this clock signal— we'll call them "flip-flops". Flip-flops are able to "remember" results (like latches), but only do so whenever the clock signal rises from low to high. (This allows us to synchronize each stage of a multi-step calculation with the clock signal.)

[1] Specifically, a latch is composed of multiple negating logic gates that feed their own outputs back into their inputs. It's weird thinking about that at first, but it actually works exactly as you would expect "memory" to work, and it's pretty clever.

[2] As a layman, you may have heard of this mystical "clock signal" before, in terms of a marketing number used to advertise how "fast" your CPU supposedly is. The "clock speed" of a CPU really just determines how fast this "global clock signal" flips back and forth between "high" and "low" voltage. (And before you ask— no, clock speed is not the only indicator of CPU performance, and it's not always a reliable way to determine how fast your CPU will run a given program.)

Bigger, Badder Memory Devices

But our "flip-flop" devices can each only output one signal at a time. At any given time, a flip-flop either has the value of "1" or "0", but never both at once. To store most meaningful numbers, we need to be able to store more binary digits than that— and so, we introduce the idea of "registers". A register is an "enhanced flip-flop" that can store multiple digits at once (really we can just make registers by putting a whole bunch of flip-flops together, but it's easier to group them together and call them by a new name). Like a flip-flop, a register is synchronized by a clock signal, and can "remember" its last input for however long you like. The main difference is that it can "remember" more than one digit at once— we now have the capability to store entire binary numbers, not just single binary digits (or "bits", for short).

Cool, but what if we want hundreds or even thousands of numbers to be remembered? Enter the idea of a RAM module [1]. A RAM module functions similarly to a huge number of registers working together [2].
It's a device that has a large number of "memory cells" all lined up in a row: each cell is numbered with an ID (called its "address"), and each cell can store a single number composed of multiple bits [3].

[Image: A crude diagram of a simplistic RAM module that I drew in about 30 seconds in MS Paint]

To store a number in a RAM module, you must first connect a binary number representing a cell address to the module's "address" input, then connect the desired number you want to store into the module's "data" port, and then set the module's "write enable" signal to high voltage ("true"). After doing this, the number will be placed into the memory cell corresponding to whatever number you set as the "address".

To load a number from RAM, all you need to do is set the "address" input to the desired address of the memory cell you want (with the "write enable" signal set to low voltage), at which point the module will spit out the corresponding cell's contents out of its "data" port.

[1] AKA Random-Access Memory— another term a layman has probably heard of before!

[2] Though in practice, it's usually not built out of registers.

[3] Usually 8 bits, otherwise known as a "byte".

A Reminder!

Perhaps you have forgotten by now, but you should keep in mind that underlying all this, we're really only just "pretending" that numbers exist. Underneath it all, it's actually just a bunch of electrical voltages being manipulated by layers upon layers of different devices, with each layer adding an increasing amount of complexity and abstraction. And there are still more layers yet to come!

Building a Basic Processor

Now that we have registers and RAM modules and the like, we can build circuits called "register-transfer datapaths". That's really just a fancy schmancy term that means "some circuit that can move data in stages between a bunch of different memory spots, like registers and RAM modules, in a useful manner".

The great mathematician Alan Turing [1] once envisioned a device that could be reconfigured to perform any number of different tasks without having to change any of the internal circuitry— all you'd have to do is feed it different instructions, and it'll execute each instruction step by step. To change what the machine does, you just give it different instructions— no internal circuitry modification required. Let's build a register-transfer datapath that can act as such a "Turing Machine" [3] now.

We'll use a RAM module to hold the list of instructions that make up our "program". Each memory cell in RAM will hold a binary number that will represent an instruction for our "Turing Machine" to execute. For instance, address #0 will hold the first instruction, address #1 will hold the next instruction, and so on [4]. We'll also put a register in our device, called the "Program Counter", or "PC" for short [5]. The role of the PC is to store the address of the current instruction in RAM that we want to execute: it holds the current location that we are at during the current stage of program execution.

Let's also throw in a whole bunch of general-purpose registers into our device, to act as a convenient "scratch space" that we can use while executing instructions. For our hypothetical example, we'll label them "R0" through "R15" (i.e. 16 general-purpose "scratch space" registers in total).

In order to perform actual numerical and logical computations, let's also construct a logic sub-circuit capable of performing addition, subtraction, multiplication, or whatever arithmetic/logical operation we want [6].
This sub-circuit will really be just a set of logic gates that we can use at will. We'll throw that in our datapath circuit and call it an "Arithmetic Logic Unit", or "ALU" for short.

And of course, tying all the components together is a clock signal that oscillates between "high" and "low" voltages at a steady rate, to help us synchronize the different components in the circuit along a common timekeeping standard.

[Image: The components of our register-transfer circuit (Crude MS Paint Edition!)]

[1] Otherwise known as the "Father of Modern Computer Science", AKA the really cool crazy dude from The Imitation Game [2].

[2] No, he was not actually that crazy in real life. In fact, I'd guesstimate that at least 50% of the "facts" in that movie are completely made up. Still a movie that I recommend seeing though, as a pretty decent drama film.

[3] Not actually a Turing Machine in the formally-defined mathematical sense, although ideally it will be equivalent in power to an actual Turing Machine.

[4] Most actual CPUs don't actually encode all their instructions in single bytes. Much of the time, you need more than one byte to encode a useful instruction, which means each instruction takes up more than one RAM cell. But for the sake of simplicity, we'll go with the "1 byte = 1 instruction" thing for this hypothetical example.

[5] Sometimes, the Program Counter ("PC") is called the "Instruction Pointer", or "IP" for short.

[6] Reminder: We've known how to do this sort of thing ever since we realized that electrical voltages could be used to represent "truth", "falsehood", the digit "1", and the digit "0", and that we could manipulate these values using AND, OR, and NOT gates (which are themselves built out of transistors that manipulate the voltages).

How Our Big Fancy Device Works

Finally, we're ready to describe how our datapath circuit operates. For the purposes of this example circuit, we'll break up the operation into five stages: "Fetch", "Decode", "Execute", "Memory", and "Write Back".

First comes the "Fetch" stage— during this stage, we use the value of the PC register (i.e. the Program Counter) as a RAM address to extract a desired instruction from a particular cell in the RAM module. In other words, we "fetch" an instruction from RAM so we can take a look at what it wants us to do.

Next comes the "Decode" stage— we'll take the instruction that came out of the RAM module, and determine what that instruction wants us to do. Perhaps the instruction is telling us to write a value to one of the general-purpose registers. Or perhaps it wants us to add the values of two general-purpose registers, and store the result in another register. Maybe it wants to multiply two registers. Maybe it wants us to subtract. Whatever the instruction tells us to do, this is the stage where we find out by examining the individual bits of the instruction that we got out of the RAM module.

After determining what the instruction wants us to do during the previous stage, we then spend a stage performing the calculation that it wants. This is the "Execute" stage. For instance, if we determined that the instruction was an "ADD" instruction that requests us to add two register values and store the result in a third, we'll switch the ALU to binary addition mode, take the values of the two requested registers, and route them through the ALU. Whatever we get out of the ALU after doing this should be the result of our addition operation. We'll hang on to the result for the next couple of stages.

Next comes the "Memory" stage.
Perhaps the instruction is telling us to store some value to some address in RAM, or even read a particular RAM cell and store that value in a general-purpose register. This is where we interact with the RAM and either put a value in, or get some value out [1].

Finally we reach the "Write Back" stage. At this point, we write the result of our execution back to some target register. For instance, if the instruction told us to add two registers and store the result in a third register, this is the stage where we finally write the result back into that third register.

After all this is done, we can then use the ALU to add 1 to the PC register. This sets up our datapath for the next cycle, in which we fetch, decode, and execute the next instruction in RAM. Of course, perhaps the instruction we just executed is telling us to jump to a completely different location in the program, and begin executing the instructions starting from that new location. In that case, instead of adding 1 to the PC, we'll load the "jump target" address into the PC. This will set up our datapath to execute instructions starting from the requested jump target [2].

Congratulations, we have just built an (incredibly primitive) CPU! Really if you think about it, it's just nothing more than a circuit that manipulates voltages using layers upon layers of increasing complexity. But with enough layers of abstraction and complexity, we were able to create a device that was able to execute instructions, one by one. And to change what the device does, all we have to do is change the contents of the RAM module: we don't have to re-wire the circuit or change any hardware components at all!

[1] If you're paying attention, you'll notice that we're using the RAM module not only to store the program instructions, but also as a storage space where we can read and write data. This idea of storing the instructions and the data in the same place is central to a school of CPU design called the "Von Neumann Architecture". Critically, it's what allows us to write programs that manipulate programs, since programs themselves are really just a bunch of numbers in RAM, no different from any other number in RAM. This is why your Operating System (which is itself a computer program) can manipulate other programs (for instance, fetch other programs from your hard drive and store them into RAM), before executing them. Unfortunately, it's also what allows hackers/crackers to construct malicious data in such a way that your CPU may accidentally execute the "data" as if it were actually program code, if you're not careful (i.e. this allows them to hide malicious programs in what seem to be "normal" data files like a JPG, for instance).

[2] If you're familiar with C or BASIC, this is the machine-level equivalent of a GOTO statement. In addition to basic jumps, an instruction might also encode a conditional jump, in which it only jumps under certain conditions: perhaps it will only jump if the value in some general-purpose register is equal to 0, or perhaps it will only jump if the last ALU operation was negative, etc. This idea is the machine-level equivalent of an if statement.

Writing Programs for Our Big Fancy CPU Thing

So now that we have a device that can execute programs based on binary numbers stored in RAM, how do we go about writing programs for this device?
The simplest way would be to write your program in "human readable form", with each CPU instruction on its own line, and then translate each instruction, by hand, into the instruction number sequences required. Then we'll just load those numbers into RAM, and voila! We have a machine-coded program that our machine can execute.

For instance, if we have the following "human readable program" [1]:

SET Register 1's value to 100
SET Register 2's value to 23
ADD Register 1 and Register 2, and store the result in Register 3
MULTIPLY R1 and R2, and store the result in R4
STORE R3's contents in RAM address 42
STORE R4's contents in RAM address 9001

It might be represented in the following way in RAM:

1 1 100
1 2 23
2 1 2 3
3 1 2 4
4 3 42
4 4 9001

In this (admittedly very primitive) example, the number "1" would represent the SET instruction, the digit "2" would represent an ADD instruction, "3" represents MULTIPLY, and "4" represents STORE [2].

Of course, for larger programs, doing this kind of translation by hand is extremely time-consuming. So let's expedite this process a bit.

Let's write a program for this CPU that does this boring translation work for us! After all, this is why we built the thing, no? It's our personal boring-work laborer that will happily do anything we ask it to.

Of course this first "translator" program itself needs to be translated by hand. But that's okay: you only need to do the hand translation once. Once you have this translator program, you can use that translator program to translate other programs. Voila! You now have your very own "assembler". This "assembler" program is designed to do a 1-to-1 translation [3] of your "human readable" program (represented in RAM using ASCII) into the machine language binary number representation.

[1] Otherwise known as an "Assembly program". Real Assembly programs tend to be much more formal and more structured than this example, and also tend to look a lot less like English.

[2] Real machine instructions in RAM are obviously much, much more complex than this, and are always stored in binary rather than decimal form.

[3] In reality, the translated machine code may not actually be a 1-to-1 translation of the human-readable assembly program into machine code, due to certain limitations on the way that the machine code represents instructions. However, most of the time it usually is a 1-to-1 translation.

Higher Level Programs!

Of course, "human readable" assembly language is kind of a drag, and doesn't always represent what you want to do very easily. To write in assembly, you really (quite literally) need to speak the language of the CPU. Not everyone wants to do this. Sometimes you need a language that can represent more abstract concepts than CPU machine code can. That's where higher-level languages like C and C++ come in [1].

But our CPU can't possibly understand a language like C! After all, it's only built to understand and execute machine instructions. Admittedly though, we did just write a program that can translate human assembly into machine instructions. What if we could write another program that translates C language into assembly language? Then we could "chain" together the translations and eventually end up with a machine code representation of a C program!

This is where "compilers" come in. A compiler is a program that turns a higher-level language, like C or C++, into assembly language.
Once the assembly language translation is done, then we can use an assembler to translate the assembly into machine code, at which point the program will be ready to execute on the target CPU.

For instance, if we have the following C code:

int main() {
	int x = 6;
	int y = 9;
	return x + y;
}

Our "compiler" might translate that C code into the following assembly code:

SET R1's value to 6
SET R2's value to 9
ADD R1 and R2, and store the result in R3
STORE R3's contents in RAM address [some_address_goes_here]

We can then run our assembler on that assembly code to produce this machine code:

1 1 6
1 2 9
2 1 2 3
4 3 [some_address_goes_here]

And at last, in this form, the CPU can finally execute the instructions. Since our compiler generated assembly code that was equivalent to the C code, and our assembler generated machine code that was equivalent to the assembly code, basic logic tells us that the final machine code should be equivalent to the original C code that we wrote.

[1] It's kind of funny how in this situation a C program is considered to be "high level". Most of the time we programmers consider it to be very "low level" compared to other languages!

Even Higher Level Programs (Program-ception)!

But what about Java? Or Python? Or BASIC or JavaScript or Ruby or PHP or Perl? None of those programming languages are "compiled" into machine instructions! Java is compiled, sure, but it's compiled into something called "JVM Bytecode". That's not at all what the CPU uses to execute instructions!

Many languages that are higher level than C or C++ are "interpreted languages". That is, they are not compiled to machine code and executed as machine code. So how are they executed? How does the CPU still know how to run the code?

The answer is to write a "virtual machine", or an "interpreter". A virtual machine (or interpreter) [1] is a computer program that itself "pretends" to be something similar to a CPU— it is a program (which itself is compiled to machine code) that takes other instructions in the form of "bytecode" and executes those instructions. Essentially it's a program that executes other programs! (Program-ception, anyone?)

A typical virtual machine (in this example, written in pseudo-C) might look something roughly like this:

int main() {
	// Load the program instructions into an array.
	char* instructions;
	load_program("some_program_file.pyc", &instructions);

	// Virtual "program counter".
	unsigned int pc = 0;

	// Enter the main loop.
	while (1) {
		// Fetch the instruction using the "program counter".
		char instruction = instructions[pc];

		// Decode the instruction.
		if (instruction == ADD) {
			/* Execute the ADD instruction */
		}
		else if (instruction == SUBTRACT) {
			/* Execute the SUBTRACT instruction */
		}
		else if (instruction == MULTIPLY) {
			/* Execute the MULTIPLY instruction */
		}
		/*
		...
		And so on, and so forth, etc.
		...
		*/
		else if (instruction == HALT) {
			/* Program finished! */
			unload_program(instructions);
			return 0;
		}
		else {
			/* Invalid instruction! */
			unload_program(instructions);
			return 1;
		}

		// Increment the "program counter" to fetch
		// the next instruction.
		pc++;
	}

	// Under normal circumstances, we should never reach here.
	return 1;
}

If you're unfamiliar with C, this roughly describes a program that loads another program's instructions into memory, and then in a loop, goes through those instructions and executes them one by one until the instructions tell it to stop.
Very similar to what a CPU does, except this time it's in software rather than hardware!

This code would then itself be compiled into assembly, then machine code, and that resulting program would then be the "virtual machine": a program that is able to execute other programs written in higher-level languages.

[1] Sometimes virtual machines are considered to be the same as an interpreter, sometimes not. Usually the difference lies in how similar the instructions look like to actual machine code, although the distinction can get really blurry at times. (For instance, the standard Python "interpreter" is really both a compiler and a virtual machine all rolled into one.)

Conclusion

And that, as you asked, is the "physical" phenomenon that occurs when source code is translated to machine language, and then executed. In the end, it really boils down to nothing more than voltages: voltages, and a lot of layers of complexity piled on top.

I know that was kind of a wall o' text, but if you actually made it to the bottom here by at least skimming the prior sections and trying to understand them (bonus points if you read all the little notes at the bottom of each section), then that may be a good indication that you may be cut out for the field of Computer Engineering. Almost everything I wrote is a gross oversimplification of how things are really done, but I tried hard to briefly give a bit of a "vertical slice" of all the very basic concepts required at each layer of complexity/abstraction.

Computer Engineering is an absolutely enormous field. I am certain that no single person on this planet knows absolutely everything there is to know about how computers work. Building a computer from scratch requires the effort of thousands of people of all manners of expertise, at all levels of complexity as described above, and perhaps even more. I am by no means an expert in all of those sub-fields, but I do know enough about each layer to be able to write up this... uh... "brief summary" of most of the layers of abstraction involved.

If you are interested in this stuff, it really is a magical experience in Comp. E. to be able to finally understand even the basics of how everything just "snaps" into place: that thing in your laptop/desktop that we call a "CPU" isn't just a dumb hunk of metal that generates a lot of heat! There's actually rhyme and reason behind what it does, and generations of research and work that went into making it what it is today.

EDIT (2016–8–26): Added some images and diagrams, to break up the wall o' text a bit and help explain what's going on.
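One last illustration, a sketch of my own rather than part of the answer above: the hypothetical machine from the "Building a Basic Processor" and "Writing Programs" sections, with a program counter, registers R0 through R15, a shared RAM array, and a fetch/decode/execute loop, can be modeled in a few dozen lines of C. The opcode numbers (1 = SET, 2 = ADD, 3 = MULTIPLY, 4 = STORE) follow the example encoding above; the HALT opcode is my own addition so the loop knows when to stop, and I store everything as ints rather than single bytes so the address 9001 and the result 2300 fit.

#include <stdio.h>

/* Opcodes, following the example encoding above. HALT is added here. */
enum { HALT = 0, SET = 1, ADD = 2, MULTIPLY = 3, STORE = 4 };

int main(void)
{
	int reg[16] = {0};     /* general-purpose registers R0..R15 */
	int ram[10000] = {0};  /* "RAM": instructions and data share one array */
	int pc = 0;            /* program counter */

	/* The example program from the answer, loaded at address 0:
	   R1 = 100; R2 = 23; R3 = R1 + R2; R4 = R1 * R2;
	   RAM[42] = R3; RAM[9001] = R4; then halt. */
	int program[] = { SET, 1, 100,  SET, 2, 23,
	                  ADD, 1, 2, 3,  MULTIPLY, 1, 2, 4,
	                  STORE, 3, 42,  STORE, 4, 9001,  HALT };
	for (unsigned n = 0; n < sizeof program / sizeof program[0]; n++)
		ram[n] = program[n];

	for (;;) {
		int op = ram[pc];                        /* fetch + decode */
		if (op == SET) {                         /* SET reg, value */
			reg[ram[pc + 1]] = ram[pc + 2];
			pc += 3;
		} else if (op == ADD) {                  /* ADD a, b, dst */
			reg[ram[pc + 3]] = reg[ram[pc + 1]] + reg[ram[pc + 2]];
			pc += 4;
		} else if (op == MULTIPLY) {             /* MULTIPLY a, b, dst */
			reg[ram[pc + 3]] = reg[ram[pc + 1]] * reg[ram[pc + 2]];
			pc += 4;
		} else if (op == STORE) {                /* STORE reg, address */
			ram[ram[pc + 2]] = reg[ram[pc + 1]];
			pc += 3;
		} else {                                 /* HALT (or unknown opcode) */
			break;
		}
	}

	printf("RAM[42]   = %d\n", ram[42]);    /* expect 123 */
	printf("RAM[9001] = %d\n", ram[9001]);  /* expect 2300 */
	return 0;
}

A real CPU does all of this in hardware, with the stages overlapped in a pipeline, but the fetch/decode/execute/memory/write-back structure is the same one described above.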

People Trust Us

The use of widgets is great and allows for easy information collection!

Justin Miller