NASM Manual
Page Count: 104
Introduction to NASM

Prepared By: Muhammed Yazar Y
Updated By: Sonia V Mathew, Govind R, Darshana Suresh, Dheeraj Mohan, Jyothsna Shaji, Lakshmi Alwin, Naveen Babu, Nikhil Sojan, Nileena P.C., Sanju Alex Jacob, Vrindha K
B Tech, Dept: of CSE - NIT Calicut

Under The Guidance of: Mr. Jayaraj P B, Mr. Saidalavi Kalady
Assistant Professors, Dept: of CSE, NIT Calicut

Department Of Computer Science & Engineering
NATIONAL INSTITUTE OF TECHNOLOGY CALICUT
2019

Contents
1 Basics of Computer Organization
2 Introduction to NASM
3 Basic I/O in NASM
4 Introduction to Programming in NASM
5 Integer Handling
6 Subprograms
7 Arrays and Strings
8 Floating Point Operations

Acknowledgement
I would like to express my gratitude to Saidalavy Kalady Sir and Jayaraj P B Sir (Assistant Professors, Dept: of CSE, NIT Calicut) for guiding me throughout the making of this reference material on NASM Assembly language programming. Without their constant support and guidance this work would not have been possible. Thanks to Lyla B Das madam (Associate Professor, Dept: of ECE, NIT Calicut) for encouraging me to bring out an updated version of this work. I would also like to thank Meera Sudharman, my classmate, who helped me in verifying the contents of this document. Special thanks are due to Govind R and Sonia V Mathew (BTech 2011-2015 Batch) for updating the contents and adding more working examples to it. I am extremely grateful to Dheeraj Mohan, Nikhil Sojan, Darshana Suresh, Jyothsna Shaji, Lakshmi Alwin, Vrindha K, Nileena P.C., Sanju Alex Jacob and Naveen Babu (2016 - 2020 batch) for restructuring the manual with different learning strategies and working examples. I also want to thank all my dear batch mates and juniors who have supported me through this work and provided me with their valuable suggestions.
Muhammed Yazar

Chapter 1 Basics of Computer Organization

Machine Language
Machine language consists of instructions in the form of 0's and 1's. Every CPU has its own machine language. It is very difficult to write programs using combinations of 0's and 1's, so we rely upon either assembly language or a high level language for writing programs.

Assembly Language
An assembly language is a low-level programming language for microprocessors and other programmable devices. Assembly language uses a mnemonic to represent each low-level machine instruction.

Why Assembly Language ?
The symbolic notation of assembly language is much easier to understand than raw machine code and saves the programmer a lot of time and effort. When you study assembly language, you will get a better idea of computer organization and of how a program executes in a computer. A carefully written assembly language program can be more efficient than the same program written in a high level language. Some portions of operating systems (Eg: the Linux kernel) and some system software are written in assembly language. In programming languages like C and C++ we can even embed assembly language instructions using constructs like asm( ); (Refer to Chapter 8 of 'The Intel Microprocessors' by Barry B. Brey for more details)

What is NASM ?
The Netwide Assembler (NASM) is an assembler and disassembler for the Intel x86 architecture (explained in a subsequent section). It can be used to write 16-bit, 32-bit (IA-32) and 64-bit (x86-64) programs. NASM is considered to be one of the most popular assemblers for Linux.

Computer Architecture
The basic operational design of a computer is called its architecture. It is a set of rules and methods that describe the functionality, organization and implementation of computer systems. The x86 architecture follows the Von Neumann architecture, which is based on the stored program concept.
Von Neumann Architecture
The Von Neumann architecture, described by physicist and mathematician John Von Neumann (1903 - 1957), is a theoretical design for a stored program computer that serves as the basis for almost all modern computers. A Von Neumann machine consists of a central processor with an arithmetic/logic unit and a control unit, a memory, mass storage, and input and output.

Figure 1.1: Von Neumann Architecture

X86 Architecture
The x86 architecture is an Instruction Set Architecture (ISA) series for computer processors. Developed by Intel Corporation, the x86 architecture defines how a processor handles and executes the instructions passed to it by the operating system (OS) and software programs. The name x86 comes from the model numbers of the early processors in the family, which end in 86 (8086, 80186, 80286, 80386, 80486); the 'x' stands for the digits that vary. Key features include:
• Provides a logical framework for executing instructions through a processor.
• Allows software programs and instructions to run on any processor in the Intel 8086 family.
• Provides procedures for utilizing and managing the hardware components of a central processing unit (CPU).

Historically there have been 2 types of computers:
Fixed Program Computers - Their function is very specific and they can't be reprogrammed, e.g. calculators.
Stored Program Computers - These can be programmed to carry out many different tasks. The programs are stored in their memory, hence the name.

Processor
The processor is the brain of the computer. It performs all the mathematical, logical and control operations of a computer. It is the main component of the computer and executes all the instructions given to it in the form of programs. It interacts with I/O devices, memory (RAM) and secondary storage devices and thus carries out the instructions given by the user. The term processor is used interchangeably with the term central processing unit (CPU), although strictly speaking, the CPU is not the only processor in a computer.
Figure 1.2: Intel Core i9

Registers
Registers are the most immediately accessible memory units for the processor. They are the fastest among all the types of memory. They reside inside the processor, and the processor can typically access the contents of any register in a single clock cycle. Registers are the working memory of a processor, i.e, if we want the processor to operate on some data, that data generally has to be brought into one of its registers first.

What is a clock cycle?
In computers, the clock cycle is the amount of time between two pulses of an oscillator. It is a single increment of the central processing unit (CPU) clock during which the smallest unit of processor activity is carried out. The clock cycle helps in determining the speed of the CPU, as it is considered the basic unit for measuring how fast an instruction can be executed by the processor.

The series of processors that began with the 8086 (8086, 80186, 80286, 80386, Pentium etc) are referred to as x86 or 80x86 processors. The processors released on or after the 80386 are called i386 processors. They are 32 bit processors, so their register sizes are generally 32 bit. In this section we will go through the i386 registers.

Intel maintains backward compatibility of its instruction sets, i.e, we can run a program designed for an old 16 bit machine on a 32 bit machine. That is the reason why we can install a 32-bit OS on a 64 bit PC. The only problem is that the program will not use the complete set of registers and other available resources, and thus it will be less efficient.

A register may hold a storage address or any kind of data (such as a bit sequence or individual characters). The general purpose registers are usually as wide as the word size of the processor - for example, in a 64-bit computer, they are 64 bits in length.

1. General Purpose Registers
General purpose registers are used to store temporary data within the microprocessor. There are eight general purpose registers.
They are EAX, EBX, ECX, EDX, EBP, ESI, EDI, ESP. We can refer to the lower 16 bits of these registers by dropping the E (e.g. AX is the lower 16 bits of EAX), and for EAX, EBX, ECX and EDX we can also refer to the two bytes of that 16-bit part separately (e.g. AL and AH are the lower and upper bytes of AX) (see image). This is to maintain the backward compatibility of instruction sets. These registers are also known as the scratchpad area, as they are used by the processor to store intermediate values in a calculation and also for storing address locations.

The General Purpose Registers are used for :
• EAX: Accumulator Register - Contains the value of some operands in some operations (E.g.: multiplication).
• EBX: Base Register - Pointer to some data in the Data Segment.
• ECX: Counter Register - Acts as loop counter, used in string operations etc.
• EDX: Data Register - Used as pointer to I/O ports, and holds the upper half of the operand or result in some operations (E.g.: multiplication and division).
• ESI: Source Index - Acts as source pointer in string operations. It can also act as a pointer in the Data Segment (DS).
• EDI: Destination Index - Acts as destination pointer in string operations. It can also act as a pointer in the Extra Segment (ES).
• ESP: Stack Pointer - Always points to the top of the system stack.
• EBP: Base Pointer - By convention it points to the base (bottom) of the current stack frame.

Figure 1.3: i386 Registers
Figure 1.4: Segment Registers

2. Flags and EIP
FLAGS is a special purpose register inside the CPU that contains the status of the CPU / the status of the last operation executed by the CPU. Some of the bits in FLAGS need special mention:
• Carry Flag: If the last arithmetic operation produced a carry out of (or a borrow into) the most significant bit, the Carry Flag will be set to 1.
• Zero Flag: If the result of the last operation was zero, the Zero Flag will be set to 1, else it will be zero.
• Sign Flag : If the result of the last operation, read as a signed number, is negative (i.e. its most significant bit is 1) then the Sign Flag is set to 1, else it will be zero.
• Parity Flag: If there is an even number of ones in the least significant byte of the result of the last operation, the Parity Flag will be set to 1, else it will be zero.
• Interrupt Flag: Only if the Interrupt Flag is set to 1 will the processor respond to (maskable) external interrupts.

EIP: EIP is the instruction pointer; it points to the next instruction to be executed.
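The sub-register names and the Carry and Zero flags described above can be tried out with a small sketch. It assumes a 32-bit Linux target and uses the exit system call (covered in Chapter 3) only to return a value; the file name regs.asm and the label done are made up for this illustration.

```nasm
; regs.asm - sub-registers and flags (illustrative sketch, assumed file name)
; Assemble and link as a 32-bit Linux program:
;   nasm -f elf regs.asm
;   ld -m elf_i386 regs.o -o regs
section .text
global _start

_start:
    mov eax, 12345678h  ; full 32-bit register: ax = 5678h, ah = 56h, al = 78h
    mov al, 0FFh        ; only the lowest byte changes: eax = 123456FFh
    xor ebx, ebx        ; ebx = 0
    mov bl, ah          ; bl = 56h, ah is untouched by the write to al
    add al, 1           ; 0FFh + 1 wraps to 00h: Carry Flag = 1, Zero Flag = 1
    jnc done            ; jump only if Carry Flag = 0 (not taken here)
    add ebx, 1          ; reached because the carry was set: ebx = 57h
done:
    mov eax, 1          ; system call number for exit
    int 80h             ; exit status = ebx = 57h (87 in decimal)
```

Running ./regs followed by echo $? in a shell should print 87, which confirms both that AH kept its value and that the 8-bit addition set the Carry Flag.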
In memory there are basically two classes of things stored:
• Data
• Program
When we start a program, it is copied into main memory and EIP is set to point to the start of the program in memory. As each instruction is executed, EIP moves on to the next one, so the instructions execute sequentially. Branch statements like JMP, RET, CALL, JNZ (we will see shortly) alter the value of EIP.

3. Segment Registers
In x86 processors, memory is accessed using two kinds of registers together: a segment register and an offset. The segment register identifies the base address of a particular section of memory, and the offset says how many bytes to move from that base address to reach the particular data. CS contains the base address of the Code Segment and EIP is the offset. EIP keeps on updating while executing each instruction. SS or Stack Segment contains the base address of the system stack. ESP and EBP are the offsets for it.

The stack is a data structure that follows LIFO, ie. Last-In-First-Out. There are two main operations associated with a stack: push and pop. To insert an element into a stack we push it, and when we give the pop instruction we get back the last value which we pushed. The stack grows downward. ESP (Stack Pointer) always points to the top of the stack; when we push an element, ESP is reduced by the size of the element in bytes and the data is stored at the new location.

DS, ES, FS and GS act as base registers for a lot of data operations like array addressing, string operations etc. ESI, EDI and EBX can act as offsets for them. Unlike other registers, segment registers are still 16 bits wide in 32-bit processors. In modern 32 bit processors the value in a segment register is just an index to an entry in a descriptor table in memory; the processor combines the base address found there with the offset to get the exact memory location. This is called segmentation.
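The push and pop behaviour described above can be checked with a short sketch. It again assumes a 32-bit Linux target and borrows the exit system call from Chapter 3 to report a result; the file name stack.asm is an assumption made for this example.

```nasm
; stack.asm - LIFO order and the downward-growing stack (illustrative sketch)
;   nasm -f elf stack.asm
;   ld -m elf_i386 stack.o -o stack
section .text
global _start

_start:
    mov ecx, esp        ; remember where the top of the stack is now
    push dword 10       ; esp = esp - 4, 10 is stored at [esp]
    push dword 20       ; esp = esp - 4, 20 is stored at [esp]
    mov edx, ecx
    sub edx, esp        ; edx = 8, the stack grew downward by two dwords
    pop eax             ; eax = 20, the last value pushed comes off first
    pop ebx             ; ebx = 10, esp is back to its starting value
    add ebx, edx        ; ebx = 10 + 8 = 18
    mov eax, 1          ; system call number for exit
    int 80h             ; exit status = ebx = 18
```

An exit status of 18 (seen with echo $?) shows that the second pop returned the first value pushed and that ESP moved down by 8 bytes for the two pushes.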
In the x86 architecture, when we push or save some data in memory, its least significant byte is stored at the lowest address, and thus x86 is said to follow the Little Endian form. The MIPS architecture is commonly used in Big Endian form.

Little Endian and Big Endian
Endianness refers to the sequential order in which bytes are arranged into larger numerical values when stored in memory or when transmitted over digital links. Endianness is of interest in computer science because two conflicting and incompatible formats are in common use: words may be represented in big-endian or little-endian format, depending on whether bits or bytes or other components are ordered from the big end (most significant) or the little end (least significant).

Big Endian Byte Order: The most significant byte (the "big end") of the data is placed at the byte with the lowest address. For a 4-byte value, the rest of the data is placed in order in the next three bytes in memory.

Little Endian Byte Order: The least significant byte (the "little end") of the data is placed at the byte with the lowest address. For a 4-byte value, the rest of the data is placed in order in the next three bytes in memory.

Suppose an integer is stored as 4 bytes (32 bits). Then a variable with the value 0x01234567 (hexadecimal representation) is stored as the four bytes 0x01, 0x23, 0x45, 0x67 on a Big Endian machine, while on a Little Endian machine (Intel x86) it is stored in the reverse order: 0x67, 0x45, 0x23, 0x01.

Bus
Bus is the name given to any communication medium that transfers data between two components. (A bus is a subsystem that is used to connect computer components and transfer data between them.) A bus may be parallel or serial. Parallel buses transmit data across multiple wires. Serial buses transmit data in bit-serial format. We can classify the buses associated with the processor into three.
• Data Bus : It is the bus used to transfer data between the processor and memory or any other I/O devices (for both reading and writing).
As the size of the data bus increases, it can transfer more data in a single stretch. The sizes of the data bus in common processors by Intel are given below.

Processor                                  Bus size
8088, 80188                                8 bit
8086, 80186, 80286, 80386SX                16 bit
80386DX, 80486                             32 bit
80586, Pentium Pro and later processors    64 bit

• Address Bus : The address bus is used by the CPU or a direct memory access (DMA) enabled device to specify the physical address that a read/write command refers to. The Memory Management Unit (MMU) or Memory Control Unit (MCU) is the set of electronic circuits which helps the processor in reading or writing data from or to a location in the RAM. Addresses are placed on the address bus by the CPU in the form of bits, and the width of the address bus determines how much memory a system can address. A system with a 32-bit address bus can address 4 gigabytes of memory space. The maximum size of RAM which can be used in a PC is determined by the size of the address bus. If the size of the address bus is n bits, it can address a maximum of 2^n bytes of RAM. This is the reason why, even if we add more than 4 GB of RAM to a 32 bit PC, the system cannot find more than 4 GB of available memory.

Processor                                     Address Bus Width    Maximum Addressable RAM size
8088, 8086, 80186, 80188                      20                   1 MB
80386SX, 80286                                24                   16 MB
80486, 80386DX, Pentium, Pentium OverDrive    32                   4 GB
Pentium II, Pentium Pro                       36                   64 GB

• Control Bus : The control bus carries information which controls the operations of the processor or any other memory or I/O device. For example, the data bus is used for both reading and writing, so how does the memory or MMU know whether data has to be written to a location or read from a location when it encounters an address on the address bus? This ambiguity is cleared using the read and write bits in the control bus. When the read bit is enabled, the MMU will read the data at the address location in the address bus and place it on the data bus.
When the write bit is enabled, the MMU will write the data in the data bus into the address location in the address bus.

Interrupts
Interrupts are among the most critical events handled by a processor. Interrupts may be triggered by external sources or due to internal operations. In Linux based systems 80h is the interrupt number used for operating system calls, and in DOS based systems it is 21h. The Operating System Interrupt is used for implementing system calls. Whenever an interrupt occurs, the processor will stop its present work, preserve the values in its registers into memory and then execute the ISR (Interrupt Service Routine) by referring to an interrupt vector table. The ISR is the set of instructions to be executed when an interrupt occurs. By referring to the interrupt vector table, the processor can find out which ISR it should execute for the given interrupt.
After executing the ISR, the processor will restore the registers to their previous state and continue the process that it was executing before. Almost all I/O devices work by utilizing interrupt requests. Here an interrupt can be seen as a signal from a device, such as the keyboard, to the CPU, telling the processor to immediately stop whatever it is currently doing and do something else. For example, the keyboard controller sends an interrupt when a key is pressed. To know how to call on the kernel when a specific interrupt arises, the CPU has a vector table set up by the OS and stored in memory. There are 256 interrupt vectors on x86 CPUs, numbered from 0 to 255, which act as entry points into the kernel. The number of interrupt vectors or entry points supported by a CPU differs based on the CPU architecture.

System call
System calls are the Application Programmer's Interface to the kernel space. In a NASM program, input has to be taken from the standard input device (keyboard) and output has to be given to the standard output device (monitor). This is implemented using the Operating System's read and write system calls respectively. Interrupt number 80h (in hexadecimal) is given to the software generated interrupt in Linux systems. Applications invoke system calls using this interrupt number. When an application triggers int 80h, the OS will understand that it is a request for a system call, and it will refer to the general purpose registers to find out and execute the exact Interrupt Service Routine (ie. the system call here). The standard convention to use a system call is,
• The system call number is stored in the EAX register.
• Other parameters needed to implement the system call are stored in the other general purpose registers.
• Trigger the 80h interrupt using the instruction INT 80h. Then the OS will implement the system call.

System memory in Linux can be divided into two distinct regions: kernel space and user space.
Kernel space is where the kernel (i.e., the core of the operating system) executes (i.e., runs) and provides its services.

Chapter 2 Introduction to NASM

Sections in NASM
A NASM program is divided into three sections.
1. section .text : This section contains the executable code, and execution starts from here. It is analogous to the main() function in C.
2. section .bss : Here, variables are declared without initialisation.
3. section .data : Variables are declared and initialised in this section.

For declaring space in memory the following directives are used,
(a) RESx: Reserves space in memory for a variable without giving any initial value.
(b) Dx: Declares space in memory for a variable and also provides its initial value at that moment.
where x can be replaced with different characters as shown in the table below.

x    Meaning        Bytes
b    BYTE           1
w    WORD           2
d    DOUBLE WORD    4
q    QUAD WORD      8
t    TEN BYTE       10

Examples:
section .data
var1: db 10 ; Reserve one byte in memory for storing var1 and var1=10
var2: db 1,2,3,4
string: db 'Hello'
string2: db 'H','e','l','l','o'

Here both string and string2 are identical. They are 5 bytes long and store the string Hello. Each character in the string is first converted to its ASCII code, and that numeric value is stored in each byte location.

section .bss
var1: resb 1
var2: resq 1
var3: resw 1

TIMES
It is used to create and initialize large arrays with a common initial value for all the elements.
Eg:
var: times 100 db 1
Creates an array of 100 bytes, and each element will be initialized with the value 1.

Dereferencing in NASM
To access the data stored at an address, the dereferencing operator used is '[ ]'.
Examples:
mov eax, [var] ; Value at address location var would be copied to eax
mov eax, var ; Address location var is copied to eax

Type casting
It is required for operands where the assembler cannot work out how many bytes to access at a memory location, because nothing else in the instruction fixes the size (like INC on a memory variable, or MOV of an immediate value into memory).
For instructions where the other operand is a register (like ADD eax, [label] or SUB eax, [label]) it is not mandatory, because the register fixes the operand size. The directives used for specifying the datatype are: BYTE, WORD, DWORD, QWORD, TWORD.
Eg:
MOV dword[ebx], 1
INC BYTE[label]
ADD eax, dword[label]

x86 Instruction Set

1. MOV - Move/Copy
Copies the content of one register/memory location to another, or changes the value of a register/memory variable to an immediate value.
Syntax: mov dest, src
• dest should be a register/memory operand; src can be a register, a memory operand or an immediate value.
• Both src and dest cannot together be memory operands.
Eg:
mov eax, ebx ; Copy the content of ebx to eax
mov ecx, 109 ; Changes the value of ecx to 109
mov al, bl
mov byte[var1], al ; Copy the content of al register to the variable var1 in memory
mov word[var2], 200
mov eax, dword[var3]

2. MOVZX - Move and Extend
Copies a value from a smaller register / memory location into a larger register and fills the extra upper bits with zeros.
Syntax: movzx dest, src
• Size of dest should be greater than the size of src.
• src should be a register / memory operand.
• dest should be a register.
• Works only with unsigned numbers, since the upper bits are always filled with zeros (for signed numbers the sign-extending instruction MOVSX is used instead).
Eg:
movzx eax, ah
movzx cx, al

3. ADD - Addition
Syntax: add dest, src
dest = dest + src;
Used to add the values of two registers/memory variables and store the result in the first operand.
• dest should be a register/memory operand; src can be a register, a memory operand or an immediate value.
• Both src and dest cannot together be memory operands.
• Both the operands should have the same size.
Eg:
add eax, ecx ; eax = eax + ecx
add al, ah ; al = al + ah
add ax, 5
add edx, 31h

4. SUB - Subtraction
Syntax: sub dest, src
dest = dest - src;
Used to subtract the values of two registers/memory variables and store the result in the first operand.
• dest should be a register/memory operand; src can be a register, a memory operand or an immediate value.
• Both src and dest cannot together be memory operands.
• Both the operands should have the same size.
Eg:
sub eax, ecx ; eax = eax - ecx
sub al, ah ; al = al - ah
sub ax, 5
sub edx, 31h

5. INC - Increment operation
Used to increment the value of a register/memory variable by 1.
Eg:
INC eax ; eax++
INC byte[var]
INC al

6. DEC - Decrement operation
Used to decrement the value of a register/memory variable by 1.
Eg:
DEC eax ; eax--
DEC byte[var]
DEC al

7. MUL - Multiplication
Syntax: mul src
Used to multiply the value of a register/memory variable with the EAX/AX/AL register. MUL works according to the following rules.
• If src is 1 byte then AX = AL * src.
• If src is 1 word (2 bytes) then DX:AX = AX * src (ie. the upper 16 bits of the result (AX*src) will go to DX and the lower 16 bits will go to AX).
• If src is 2 words long (32 bit) then EDX:EAX = EAX * src (ie. the upper 32 bits of the result will go to EDX and the lower 32 bits will go to EAX).

8. IMUL - Multiplication of signed numbers
The IMUL instruction performs multiplication of signed numbers. It can be used mainly in three different forms.
Syntax:
(a) imul src
(b) imul dest, src
(c) imul dest, src1, src2
• If we use imul as in (a) then its working follows the same rules as MUL.
• If we use the (b) form then dest = dest * src.
• If we use the (c) form then dest = src1 * src2, where src2 must be an immediate value.

9. DIV - Division
Syntax: div src
Used to divide the value of EDX:EAX or DX:AX or the AX register by the register/memory variable in src. DIV works according to the following rules.
• If src is 1 byte then AX will be divided by src, the remainder will go to AH and the quotient will go to AL.
• If src is 1 word (2 bytes) then DX:AX will be divided by src, the remainder will go to DX and the quotient will go to AX.
• If src is 2 words long (32 bit) then EDX:EAX will be divided by src, the remainder will go to EDX and the quotient will go to EAX.

10. NEG - Negation of Signed numbers
Syntax: NEG op1
The NEG instruction negates the given register/memory variable.

11. CLC - Clear Carry
This instruction clears the carry flag bit in CPU FLAGS.

12. ADC - Add with Carry
Syntax: ADC dest, src
dest = dest + src + Carry Flag;
ADC is used for the addition of large numbers.
Suppose we want to add two 64 bit numbers. We keep the first number in EDX:EAX (ie. the most significant 32 bits in EDX and the others in EAX) and the second number in EBX:ECX. Then we perform the addition as follows.
Eg:
clc ; Clearing the carry FLAG
add eax, ecx ; Normal addition of eax with ecx
adc edx, ebx ; Adding with carry for the higher bits.

13. SBB - Subtract with Borrow
Syntax: SBB dest, src
SBB is analogous to ADC and is used for the subtraction of large numbers. Suppose we want to subtract two 64 bit numbers. We keep the first number in EDX:EAX and the second number in EBX:ECX. Then we perform the subtraction as follows.
Eg:
clc ; Clearing the carry FLAG
sub eax, ecx ; Normal subtraction of ecx from eax
sbb edx, ebx ; Subtracting with borrow for the higher bits.

Branching In x86

14. JMP - Unconditionally Jump to label
JMP is similar to the goto label statement in C/C++. It is used to transfer control to any part of our program without checking any conditions.

15. CMP - Compares the Operands
Syntax: CMP op1, op2
When we apply the CMP instruction over two operands, say op1 and op2, it will perform the operation op1 − op2 and will not store the result. Instead it will affect the CPU FLAGS. It is similar to the SUB operation, without saving the result. For example, if op1 = op2 then the result is zero and the Zero Flag (ZF) will be set to 1.

NB: For generating conditional jumps in x86 programming we first perform the CMP operation between two registers/memory operands and then use one of the following jump operations, which check the CPU FLAGS.
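The 64 bit addition with ADC and the CMP-then-jump pattern described above can be combined in one small sketch. It assumes a 32-bit Linux target and uses the exit system call from Chapter 3 to report the outcome; the file name add64.asm, the labels wrong and finish, and the two sample numbers are chosen only for this illustration.

```nasm
; add64.asm - 64-bit addition on 32-bit registers (illustrative sketch)
;   nasm -f elf add64.asm
;   ld -m elf_i386 add64.o -o add64
section .text
global _start

_start:
    mov edx, 1            ; first number  = 00000001FFFFFFFFh held in EDX:EAX
    mov eax, 0FFFFFFFFh
    mov ebx, 0            ; second number = 0000000000000001h held in EBX:ECX
    mov ecx, 1
    add eax, ecx          ; lower halves: eax = 0 and the Carry Flag is set
    adc edx, ebx          ; upper halves plus carry: edx = 1 + 0 + 1 = 2
    cmp edx, 2            ; computes edx - 2, changes only the flags
    jnz wrong             ; taken only if the Zero Flag is 0 (edx is not 2)
    mov ebx, 0            ; expected path: exit status 0
    jmp finish
wrong:
    mov ebx, 1            ; the carry was lost: exit status 1
finish:
    mov eax, 1            ; system call number for exit
    int 80h
```

The program exits with status 0 because the carry out of the lower halves is added into the upper halves, giving 0000000200000000h in EDX:EAX. Replacing adc with a plain add would leave edx at 1 and the exit status would become 1.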
Conditional Jump Instructions:

Instruction    Working
JZ             Jump If Zero Flag is Set
JNZ            Jump If Zero Flag is Unset
JC             Jump If Carry Flag is Set
JNC            Jump If Carry Flag is Unset
JP             Jump If Parity Flag is Set
JNP            Jump If Parity Flag is Unset
JO             Jump If Overflow Flag is Set
JNO            Jump If Overflow Flag is Unset

Advanced Conditional Jump Instructions:
In 80x86 processors Intel has added some enhanced versions of the conditional jumps which are much easier to use than the traditional jump instructions. They make it easy to compare two variables. We need to use CMP op1, op2 before using this set of jump instructions. There are separate classes for comparing signed and unsigned numbers.

(a) For Unsigned numbers:

Instruction            Working
JE                     Jump if op1 = op2
JNE                    Jump if op1 ≠ op2
JA (jump if above)     Jump if op1 > op2
JNA                    Jump if op1 ≤ op2
JB (jump if below)     Jump if op1 < op2
JNB                    Jump if op1 ≥ op2

(b) For Signed numbers:

Instruction              Working
JE                       Jump if op1 = op2
JNE                      Jump if op1 ≠ op2
JG (jump if greater)     Jump if op1 > op2
JNG                      Jump if op1 ≤ op2
JL (jump if less)        Jump if op1 < op2
JNL                      Jump if op1 ≥ op2

16. LOOP - Loop using ECX as counter
Syntax: loop label
Decrements ecx and jumps to label if ecx is not equal to 0.
C equivalent (for n greater than 0):
sum = 0; ecx = n;
do { sum = sum + ecx; ecx--; } while (ecx != 0);
x86 equivalent:
mov dword[sum], 0
mov ecx, dword[n]
addition:
add [sum], ecx
loop addition ; Decrements ecx and checks if ecx is not equal to 0, if so it will jump to addition

Boolean Operators

17. AND - (Bitwise Logical AND)
Syntax: AND op1, op2
Performs the bitwise logical AND operation of op1 and op2 and assigns the result to op1.
op1 = op1 & op2; //Equivalent C Statement
Let x = 10101001b and y = 10110010b be two 8-bit binary numbers. Then x & y:

x          1 0 1 0 1 0 0 1
y          1 0 1 1 0 0 1 0
x AND y    1 0 1 0 0 0 0 0

18. OR - (Bitwise Logical OR)
Syntax: OR op1, op2
Performs the bitwise logical OR operation of op1 and op2 and assigns the result to op1.
op1 = op1 | op2; //Equivalent C Statement
Let x = 10101001b and y = 10110010b be two 8-bit binary numbers. Then x | y:

x          1 0 1 0 1 0 0 1
y          1 0 1 1 0 0 1 0
x OR y     1 0 1 1 1 0 1 1

19. XOR - (Bitwise Logical Exclusive OR)
Syntax: XOR op1, op2
Performs the bitwise logical XOR operation of op1 and op2 and assigns the result to op1.
op1 = op1 ^ op2; //Equivalent C Statement
Let x = 10101001b and y = 10110010b be two 8-bit binary numbers. Then x ⊕ y:

x          1 0 1 0 1 0 0 1
y          1 0 1 1 0 0 1 0
x XOR y    0 0 0 1 1 0 1 1

20. NOT - (Bitwise Logical Negation)
Syntax: NOT op1
Performs the bitwise logical NOT of op1 and assigns the result to op1.
op1 = ~op1; //Equivalent C Statement

x     1 0 1 0 1 0 0 1
¬x    0 1 0 1 0 1 1 0

21. TEST - (Logical AND, affects only CPU FLAGS)
Syntax: TEST op1, op2
• It performs the bitwise logical AND of op1 and op2 but it won't save the result to any register. Instead, the result of the operation will affect the CPU FLAGS.
• It is similar to the CMP instruction in usage.

22. SHL - Shift Left
Syntax: SHL op1, op2
op1 = op1 << op2; //Equivalent C Statement
• SHL performs the bitwise left shift. op1 should be a register/memory variable, and op2 must be an immediate (constant) value or the CL register.
• It will shift the bits of op1 towards the left op2 times and set the rightmost op2 bits to 0.
Example: shl al,3

al       1 0 1 0 1 0 0 1
al<<3    0 1 0 0 1 0 0 0

23. SHR - Shift Right
Syntax: SHR op1, op2
op1 = op1 >> op2; //Equivalent C Statement
• SHR performs the bitwise right shift. op1 should be a register/memory variable, and op2 must be an immediate (constant) value or the CL register.
• It will shift the bits of op1 towards the right op2 times and set the leftmost op2 bits to 0.
Example: shr al,3

al       1 0 1 0 1 0 0 1
al>>3    0 0 0 1 0 1 0 1

24. ROL - Rotate Left
Syntax: ROL op1, op2
ROL performs the bitwise cyclic left shift. op1 could be a register/memory variable, and op2 must be an immediate (constant) value or the CL register.
Example: rol al,3

al          1 0 1 0 1 0 0 1
rol al,3    0 1 0 0 1 1 0 1

25. ROR - Rotate Right
Syntax: ROR op1, op2
ROR performs the bitwise cyclic right shift. op1 could be a register/memory variable, and op2 must be an immediate (constant) value or the CL register.
Example: ror al,3

al          1 0 1 0 1 0 0 1
ror al,3    0 0 1 1 0 1 0 1

26. RCL - Rotate Left with Carry
Syntax: RCL op1, op2
Its working is the same as that of rotate left, except that it treats the carry bit as an extra leftmost bit of op1 and then performs the left rotation.

27. RCR - Rotate Right with Carry
Syntax: RCR op1, op2
Its working is the same as that of rotate right, except that it treats the carry bit as an extra leftmost bit of op1 and then performs the right rotation.

Stack Operations

28. PUSH
Pushes a value onto the system stack. It decreases the value of ESP and copies the value of a register/constant onto the system stack.
Syntax: PUSH register/constant
Examples:
PUSH ax ; ESP = ESP − 2 and copies value of ax to [ESP]
PUSH eax ; ESP = ESP − 4 and copies value of eax to [ESP]
PUSH ebx
PUSH dword 5
PUSH word 258

29. POP
Pops a value off the system stack. The POP instruction copies the value stored at the top of the system stack to a register and then increases the value of ESP.
Examples:
POP bx ; ESP = ESP + 2 and copies value from stack to bx
POP ebx ; ESP = ESP + 4
POP eax

30. PUSHA
Pushes the values of all general purpose registers. PUSHA is used to save the values of the general purpose registers, especially when calling subprograms which will modify their values.

31. POPA
Pops off the values of all general purpose registers which we have pushed before using the PUSHA instruction.

32. PUSHF
Pushes all the CPU FLAGS.

33. POPF
Pops off and restores the values of all CPU FLAGS which have been pushed before using the PUSHF instruction.

NB: It is important to pop off the values pushed onto the stack properly. Even a minute mistake in any of the PUSH / POP instructions could stop the program from working.

34. Pre-processor Directives in NASM:
In NASM %define acts similarly to C's preprocessor directive #define. This can be used to declare constants.
Eg:
%define SIZE 100

Comments in NASM
Only single line comments are available in NASM. A semicolon (;) starts a comment, and everything after it on that line is ignored by the assembler.
Ex: ; A program to find the largest of 2 numbers

NASM Installation
NASM is freely available on the internet. You can visit www.nasm.us. Its documentation is also available there. To install NASM on Windows, you can download it as an installation package from the site and install it easily. In Ubuntu Linux you can give the command:

    sudo apt-get install nasm

and in Fedora you can use the command:

    su -c "yum install nasm"

in a terminal and easily install nasm.

Compilation
1. Assembling the source file

    nasm -f elf filename.asm

This creates an object file, filename.o, in the current working directory.

2. Creating an executable file
For a 32 bit machine:

    ld filename.o -o output_filename

For a 64 bit machine:

    ld -melf_i386 filename.o -o output_filename

This creates an executable file of the name output_filename.

3. Program execution

    ./output_filename

For example, if the program to be run is first.asm:

    nasm -f elf first.asm
    ld first.o -o output
    ./output

Chapter 3 Basic I/O in NASM

In this chapter, we will learn how to obtain user input and how to return the output to the output device. Input from the standard input device (keyboard) and output to the standard output device (monitor) in a NASM program are implemented using the Operating System's read and write system calls. Interrupt number 80h is assigned to the software generated interrupt in Linux systems. Applications make system calls using this interrupt. When an application triggers int 80h, the OS understands that it is a request for a system call and refers to the general purpose registers to find and execute the exact Interrupt Service Routine (ie. the system call here). The standard convention for using the software interrupt 80h is to put the system call number in the eax register and the other parameters needed by the system call in the other general purpose registers. Then we trigger the 80h interrupt using the instruction 'INT 80h', and the OS carries out the system call.

1.
Exit System Call
• This system call is used to exit from the program.
• The system call number for exit is 1, so it is copied to the eax register.
• The exit status of a program that finishes successfully is 0, and it is passed as the parameter of the exit( ) system call. We need to copy 0 to the ebx register.
• Then we trigger INT 80h.

    mov eax, 1      ; System Call Number
    mov ebx, 0      ; Parameter
    int 80h         ; Triggering OS Interrupt

2. Read System Call
• Using this we can read only strings/characters.
• The system call number for read is 3. It is copied to eax.
• The standard input device (keyboard) has the reference number 0, and it must be copied to the ebx register.
• We need to copy to the ecx register the pointer to the memory location where the input string is to be stored.
• We need to copy the number of characters to be read to the edx register.
• Then we trigger INT 80h.
• The string is stored at the location whose address we copied to ecx.

    mov eax, 3              ; Sys_call number for read
    mov ebx, 0              ; Source Keyboard
    mov ecx, var            ; Pointer to memory location
    mov edx, dword[size]    ; Size of the string
    int 80h                 ; Triggering OS Interrupt

• This method is also used for reading integers, and it is a bit tricky. If we need to read a single digit, we read it as a single character and then subtract 30h from it (ASCII of 0 = 30h). The variable then holds the actual value of that digit.
(a) Reading a single digit number

    mov eax, 3
    mov ebx, 0
    mov ecx, digit1
    mov edx, 1
    int 80h
    sub byte[digit1], 30h
    ;Now we have the actual number in [digit1]

(b) Reading a two digit number

    ;Reading first digit
    mov eax, 3
    mov ebx, 0
    mov ecx, digit1
    mov edx, 1
    int 80h

    ;Reading second digit
    mov eax, 3
    mov ebx, 0
    mov ecx, digit2
    mov edx, 2              ;Here we put 2 because we need to read and omit enter key press as well
                            ;(so 2 bytes must be reserved at digit2)
    int 80h

    sub byte[digit1], 30h
    sub byte[digit2], 30h

    ;Getting the number from ASCII
    ; num = (10* digit1) + digit2
    mov al, byte[digit1]    ; Copying first digit to al
    mov bl, 10
    mul bl                  ; Multiplying al with 10
    movzx bx, byte[digit2]  ; Copying digit2 to bx
    add ax, bx
    mov byte[num], al       ; We are sure that the number is less than 256, so we can omit the higher 8 bits of the result.

3. Write System Call
• This system call can be used to print output to the monitor.
• Using this we can write only strings / characters.
• The system call number for write is 4. It is copied to eax.
• The standard output device (monitor) has the reference number 1, and it must be copied to the ebx register.
• We need to copy to the ecx register the pointer to the memory location where the output string resides.
• We need to copy the number of characters in the string to the edx register.
• Then we trigger INT 80h.

See the example given below, which is equivalent to the C statement printf().
Eg:

    mov eax, 4      ;Sys_call number
    mov ebx, 1      ;Standard Output device
    mov ecx, msg1   ;Pointer to output string
    mov edx, size1  ;Number of characters
    int 80h         ;Triggering interrupt.

• This method is also used to output numbers. If we have a number, we break it into digits and keep each digit in a variable of size 1 byte. Then we add 30h (ASCII of 0) to each digit, which gives the ASCII code of the character to be printed.

Chapter 4 Introduction to Programming in NASM

Now that we have gone through all the basics required for creating a program, let's begin. Like all programming lessons, we will first learn how to create a Hello World program.
Programming in NASM is easy if you understand how each construct is used and implemented. A transition from a C-like language can be frustrating at first, but let's take it step by step. We will learn to program by translating code snippets in C language to assembly code.

Hello World program
As discussed earlier, the whole program is divided into 3 sections, namely the code section (section .text), the section for uninitialised variables (section .bss) and the section for initialised variables (section .data). section .text is the place from where the execution starts in a NASM program, analogous to the main() function in C programming.

1. Let's store the string "Hello World" into a variable (str in the pseudo code, string in the NASM code). We are going to print that string as the output.
Pseudo Code:

    char str[13] = "Hello World";

NASM Code:

    section .data                   ;For Storing Initialized Variables
    string: db 'Hello World', 0Ah   ;String Hello World followed by a newline character
    length: equ $-string            ; Length of the string stored to a constant.

NB: Using equ we declare constants, ie. their value won't change during execution. $-string returns the length of the string variable in bytes (ie. the number of characters), which is 12 here counting the newline.

2. Now we need to print this variable onto the screen. For that we use the write system call.
Pseudo Code:

    printf("%s",str);

NASM Code:

    mov eax, 4          ; Using int 80h to implement write() sys_call
    mov ebx, 1
    mov ecx, string
    mov edx, length
    int 80h

3. Now that we have printed the text, let's exit the program. We use the exit system call for this.
Pseudo Code:

    return 0;

NASM Code:

    ;Exit System Call
    mov eax, 1
    mov ebx, 0
    int 80h

4. section .text is the place from where the execution starts in a NASM program, analogous to the main() function in C programming.
So let's put the print and exit parts into that section. The final program looks like this:

    section .text                   ; Code Section
    global _start
    _start:
        mov eax, 4                  ; Using int 80h to implement write() sys_call
        mov ebx, 1
        mov ecx, string
        mov edx, length
        int 80h

        ;Exit System Call
        mov eax, 1
        mov ebx, 0
        int 80h

    section .data                   ;For Storing Initialized Variables
    string: db 'Hello World', 0Ah   ;String Hello World followed by a newline character
    length: equ $-string            ; Length of the string stored to a constant.

Compilation
1. Assembling the source file

    nasm -f elf filename.asm

This creates an object file, filename.o, in the current working directory.

2. Creating an executable file
For a 32 bit machine:

    ld filename.o -o output_filename

For a 64 bit machine:

    ld -melf_i386 filename.o -o output_filename

This creates an executable file of the name output_filename.

3. Program execution

    ./output_filename

For example, if the program to be run is first.asm:

    nasm -f elf first.asm
    ld first.o -o output
    ./output

It doesn't seem as difficult as you felt in the beginning, right? In the same pattern, we'll see how both simple and complex constructs are formed in assembly language.

Chapter 5 Integer Handling

Integers are read from the keyboard and printed to the screen as characters. This chapter describes how to read and print integers. We have already skimmed through the basic operations in integer handling (add, sub, mul, div) in Chapter 2. Here, we will look at sample programs that make use of these operations.

1. Declaring an integer

    section .bss
    var: resb/resw/resd 10

Here we have reserved 10 bytes/words/double words for the memory location 'var'.

2. Reading a single digit
To read a single digit, we read it as a single character and then subtract 30h from it (ASCII of 0 = 30h). This is to get the actual value of that number in that variable.
    mov eax, 3              ; Syscall number for read
    mov ebx, 0              ; Source Keyboard
    mov ecx, var            ; Pointer to memory location 'var'
    movzx edx, byte[size]   ; Size of the input
    int 80h                 ; Triggering OS Interrupt
    sub byte[var], 30h      ; Subtracting 30h to get integer value

Note that the single digit taken as input here is stored in the first byte of 'var'. Here 'size' is a byte variable holding the number of characters to read, so it is zero-extended into edx with movzx. If 'size' was declared as a word or a double word, it should be loaded with movzx edx, word[size] or mov edx, dword[size] accordingly.

3. Reading a multi-digit number
To read a multi-digit number, we read the input character by character (here, digit by digit) until we obtain the enter-key. Each time a digit arrives, the number read so far is multiplied by 10 and the new digit is added to it, which rebuilds the original number.

- Pseudo Code

    num=0
    read(temp)
    while(temp!=enterkey)
        num=num*10+temp
        read(temp)
    endwhile

NASM Code:

    read_num:
        ;;push all the used registers into the stack using pusha
        pusha
        ;;store an initial value 0 to variable 'num'
        mov word[num], 0
    loop_read:
        ;; read a digit
        mov eax, 3
        mov ebx, 0
        mov ecx, temp
        mov edx, 1
        int 80h
        ;;check if the read digit is the end of number, i.e, the enter-key whose ASCII code is 10
        cmp byte[temp], 10
        je end_read
        mov ax, word[num]
        mov bx, 10
        mul bx
        mov bl, byte[temp]
        sub bl, 30h
        mov bh, 0
        add ax, bx
        mov word[num], ax
        jmp loop_read
    end_read:
        ;;pop all the used registers from the stack using popa
        popa
        ret

4. Printing a number
Unlike when we're reading a number, there is no end character like the enter-key marking the number's last digit. Here, we extract each digit from the number and push it onto the stack. We also keep a count of the number of digits in the number. Using this count, the digits are popped off the stack and printed in order.
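Each pass of this extraction relies on DIV producing a quotient and a remainder at the same time. The following is a minimal sketch of a single step, not part of the original programs; the file name and the value 472 are illustrative assumptions, and the digit is returned as the exit status only so that it can be inspected.

```nasm
; digit_step.asm - sketch of one digit-extraction step
; (the file name and the value 472 are illustrative assumptions)
section .text
global _start

_start:
    mov dx, 0           ; DIV with a 16-bit operand divides DX:AX, so clear DX first
    mov ax, 472         ; number to split
    mov bx, 10          ; divisor
    div bx              ; AX = 47 (quotient), DX = 2 (remainder, the last digit)

    push dx             ; the digit goes onto the stack (ESP decreases by 2)
    pop dx              ; it comes back off when it is time to print it

    movzx ebx, dl       ; use the digit as the exit status so it can be inspected
    mov eax, 1          ; exit system call
    int 80h
```

Assembled and linked as a 32-bit program in the way described under Compilation, running it and then typing echo $? in the shell should show 2. Repeating the step on the quotient 47 would yield 7 and then 4, which is why the digits come off the stack in printing order.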
- Pseudo Code

    count=0
    while (num!=0)
        temp=num%10
        count++
        push(temp)
        num=num/10
    endwhile
    while(count!=0)
        temp=pop()
        print(temp)
        count--
    endwhile

NASM Code:

    print_num:
        mov byte[count],0
        pusha
    extract_no:
        cmp word[num], 0
        je print_no
        inc byte[count]
        mov dx, 0
        mov ax, word[num]
        mov bx, 10
        div bx
        push dx
        mov word[num], ax
        jmp extract_no
    print_no:
        cmp byte[count], 0
        je end_print
        dec byte[count]
        pop dx
        mov byte[temp], dl
        add byte[temp], 30h
        mov eax, 4
        mov ebx, 1
        mov ecx, temp
        mov edx, 1
        int 80h
        jmp print_no
    end_print:
        mov eax,4
        mov ebx,1
        mov ecx,newline
        mov edx,1
        int 80h
        ;;The memory location 'newline' should be declared with the ASCII key for
        ;;new line in section .data.
        popa
        ret

Example Program: Read 2 numbers and find their GCD.
Pseudo Code:

    read(num1)
    read(num2)
    while(1)
        if(num1%num2==0) goto endloop
        temp=num1%num2
        num1=num2
        num2=temp
    endwhile
    endloop:
    print(num2)

NASM Code:

    section .data
    newline: db 10

    section .bss
    num1: resw 10
    num2: resw 10
    temp: resb 10
    num: resw 10
    nod: resb 10
    count: resb 10

    section .text
    global _start
    _start:
        call read_num
        mov cx,word[num]
        mov word[num1],cx
        call read_num
        mov cx,word[num]
        mov word[num2],cx
        mov ax,word[num1]
        mov bx,word[num2]
    loop1:
        mov dx,0
        div bx
        cmp dx,0
        je end_loop
        mov ax,bx
        mov bx,dx
        jmp loop1
    end_loop:
        mov word[num],bx
        call print_num
    exit:
        mov eax,1
        mov ebx,0
        int 80h

Do not forget to add the read_num and print_num functions.

Chapter 6 Subprograms

We have now covered the basics of programming in assembly language. Often we might have to repeat the same code in the program, and that is when we use functions in languages like C. Let's see how it is done in assembly language. The concept of subprograms can be used heavily in assembly language, since input and output operations require more than one line of code. Making them into subprograms not only makes the code smaller, but also helps you understand and debug the code better.
It is highly recommended that you start using subprograms in your code from the very start of programming in assembly language.

CALL & RET Statements:
In NASM, subprograms are implemented using the call and ret instructions. The general syntax is as follows:

    ;main code....
        ------------------------
        call function_name
        ------------------------
    ;rest of the code....

    function_name:              ;Label for subprogram
        ------------------
        ;function code....
        ------------------
        ret

• When we use the CALL instruction, the address of the next instruction is copied onto the system stack and control jumps to the subprogram. ie. ESP is decreased by 4 and the address of the next instruction is stored there.
• When we execute ret towards the end of the subprogram, the address that was pushed onto the top of the stack is popped off and control jumps to it.

Calling Conventions:
The set of rules that the calling program and the subprogram should obey to pass parameters are called calling conventions. Calling conventions allow a subprogram to be called at various parts of the code with different data to operate on. The data may be passed using system registers/memory variables/system stack. If we use the system stack, the parameters should be pushed before the CALL statement. Inside the subprogram the return address is on top of the stack, so remember to pop it off, preserve it, and push it back before RET.

Now, let's see an example of how to use subprograms effectively to simplify your code. Below is a program which uses subprograms to calculate the sum of 10 input integers.

NB: One major advantage of subprograms is that you can create subprograms for input and output, as those components appear in almost all the programs you are going to write. As the code for them is quite lengthy, repeating it for each input and output will make your program even lengthier, making debugging and understanding the code a tedious task.
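Before the full program, the stack-based parameter passing described under Calling Conventions can be sketched as follows. This is only an illustration under stated assumptions: the subprogram name add_two, the file name and the values 7 and 5 are hypothetical and do not appear in the original, and the result is returned as the exit status purely so that it can be checked.

```nasm
; stack_args.asm - sketch: passing two word parameters on the system stack
; (add_two, the file name and the values 7 and 5 are hypothetical)
section .text
global _start

_start:
    push word 7         ; first parameter  (ESP decreases by 2)
    push word 5         ; second parameter (ESP decreases by 2)
    call add_two        ; pushes the 4-byte return address and jumps

    movzx ebx, ax       ; exit status = result returned in AX
    mov eax, 1          ; exit system call
    int 80h

; add_two: adds the two words pushed by the caller, result in AX
add_two:
    pop ecx             ; pop off and preserve the return address
    pop ax              ; second parameter (5), pushed last so popped first
    pop dx              ; first parameter (7)
    add ax, dx          ; AX = 12
    push ecx            ; push the return address back for RET
    ret
```

Running the program and then typing echo $? should show 12. Because the subprogram pops the parameters itself, the stack is back to its original state when control returns to the caller.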
In the program given below, three subprograms are used: one for input, one for output and the last one for calculating the sum. It is suggested that you use subprograms for input and output from now on.

    section .data
    msg1: db 0Ah,"Enter a number"
    size1: equ $-msg1
    msg2: db 0Ah,"the sum= "
    size2: equ $-msg2
    newline: db 0Ah

    section .bss
    num: resw 1
    sum: resw 1
    digit1: resb 1
    digit0: resb 1
    digit2: resb 1
    temp: resb 1
    counter: resb 1
    count: resb 1

    section .text
    global _start
    _start:
        mov byte[counter],0
        mov word[sum],0
    read_and_add:
        call read_num
        ;; Instead of the code for reading a number,
        ;; we just call the subprogram
        call add_num
        ;loop condition checking
        inc byte[counter]
        cmp byte[counter],10
        jl read_and_add
        mov ax, word[sum]
        mov word[num], ax
        call print_num
        ;; Instead of the code for printing a number,
        ;; we just call the subprogram
        ;exit
        mov eax,1
        mov ebx,0
        int 80h

    ;sub-program add_num to add a number to the existing sum
    add_num:
        mov ax,word[num]
        add word[sum],ax
        ret

    ;;subprogram to print a number
    print_num:
        mov byte[count],0
        pusha
    extract_no:
        cmp word[num], 0
        je print_no
        inc byte[count]
        mov dx, 0
        mov ax, word[num]
        mov bx, 10
        div bx
        push dx
        mov word[num], ax
        jmp extract_no
    print_no:
        cmp byte[count], 0
        je end_print
        dec byte[count]
        pop dx
        mov byte[temp], dl
        add byte[temp], 30h
        mov eax, 4
        mov ebx, 1
        mov ecx, temp
        mov edx, 1
        int 80h
        jmp print_no
    end_print:
        mov eax,4
        mov ebx,1
        mov ecx,newline
        mov edx,1
        int 80h
        ;;The memory location 'newline' should be declared with the ASCII key for
        ;;new line in section .data.
        popa
        ret

    ;; subprogram for inputting a number
    read_num:
        ;;push all the used registers into the stack using pusha
        pusha
        ;;store an initial value 0 to variable 'num'
        mov word[num], 0
    loop_read:
        ;; read a digit
        mov eax, 3
        mov ebx, 0
        mov ecx, temp
        mov edx, 1
        int 80h
        ;;check if the read digit is the end of number, i.e, the enter-key whose ASCII code is 10
        cmp byte[temp], 10
        je end_read
        mov ax, word[num]
        mov bx, 10
        mul bx
        mov bl, byte[temp]
        sub bl, 30h
        mov bh, 0
        add ax, bx
        mov word[num], ax
        jmp loop_read
    end_read:
        ;;pop all the used registers from the stack using popa
        popa
        ret

Recursive Sub-routine:
A subprogram which calls itself again and again to calculate the return value is called a recursive sub-routine. We can implement recursive sub-routines for operations like calculating the factorial of a number very easily in NASM. A program to print the Fibonacci series up to a number using a recursive subprogram is given in the appendix.

Using C Library functions in NASM:
We can use standard C library functions in our NASM program, especially for I/O operations. To do so, we have to follow C's calling conventions given below:
• Parameters are pushed onto the system stack in right to left order.
Eg: printf("%d",x)
Here the value of x must be pushed onto the system stack first and then the format string.
• The C function won't pop off the parameters automatically, so it is our duty to restore the stack pointers (ESP and EBP) after the function has been called.

Eg: Reading an integer using the C functions...

    section .text
    global main
    ;Declaring the external functions to be used in the program.......
    extern scanf
    extern printf

    ;Code to read an integer using the scanf function
    getint:
        push ebp                ;Steps to store the stack pointers
        mov ebp, esp
        ;scanf("%d",&x)
        ;Creating a space of 4 bytes on top of stack to store the int value
        ;(%d makes scanf write a full 4-byte int)
        sub esp, 4
        lea eax, [ebp-4]
        push eax                ; Pushing the address of that location
        push fmt1               ; Pushing format string
        call scanf              ; Calling scanf function
        mov ax, word[ebp-4]
        mov word[num], ax
        ;Restoring the stack registers.
        mov esp, ebp
        pop ebp
        ret

    putint:
        push ebp                ; Steps to store the stack pointers
        mov ebp, esp
        ;printf("%d",x)
        movzx eax, word[num]
        push eax                ; Pushing the int value as a full 4-byte parameter
        push fmt2               ; Pushing pointer to format string
        call printf             ; Calling printf( ) function
        mov esp, ebp            ; Restoring stack to initial values
        pop ebp
        ret

    main:                       ; Main( ) Function
        mov eax, 4
        mov ebx, 1
        mov ecx, msg1
        mov edx, size1
        int 80h
        call getint
        mov ax, word[num]
        mov bx, ax
        mul bx
        mov word[num], ax
        call putint
    exit:
        mov ebx, 0
        mov eax, 1
        int 80h

    section .data
    fmt1 db "%d",0
    fmt2 db "Square of the number is : %d",10,0
    msg1: db "Enter an integer : "
    size1: equ $-msg1

    section .bss
    num: resw 1

NB: Assembling and executing the code...
• First we have to assemble the code to object code with the NASM assembler, then we have to use the gcc compiler to make the executable code. On a 64 bit machine, gcc additionally needs the -m32 option to link this 32-bit object file.
• nasm -f elf -o int.o int.asm
• gcc int.o -o int
• ./int

Chapter 7 Arrays and Strings

An array is a continuous storage block in memory. Each element of the array has the same size. We access each element of the array using:
i) Base address/address of the first element of the array.
ii) Size of each element of the array.
iii) Index of the element we want to access.

7.1 One Dimensional Arrays
In NASM there is no array element accessing/dereferencing operator like [ ] in C / C++ / Java using which we can access each element.
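Without such an operator, the address of an element is formed from the three ingredients listed above: base address + index x element size. The following is a minimal sketch of that computation for a word array, using NASM's scaled-index addressing; the file name and the chosen index are illustrative assumptions, and the element is returned as the exit status only so that it can be checked.

```nasm
; index_demo.asm - sketch: address of element i = base + i * element size
; (the file name and the index value are illustrative assumptions)
section .data
array2: dw 191, 122, 165, 165       ; four word-sized (2-byte) elements

section .text
global _start

_start:
    mov ebx, 1                      ; index of the element we want
    mov ax, word[array2 + 2*ebx]    ; base + index * 2 bytes, so AX = 122

    movzx ebx, ax                   ; exit status = the element read
    mov eax, 1                      ; exit system call
    int 80h
```

Running the program and then typing echo $? should show 122. For a byte array the scale factor would be 1, and for a double word array it would be 4.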
Here we compute the address of each element using an iterative control structure and traverse through the elements of the array.

1. Declaring/ Initializing an array

    section .bss
    arr: resb/resw/resd 50              ;Declares an array of size 50

    section .data
    array1: db 2, 5, 8, 10, 12, 15      ;Initializes array {2,5,8,10,12,15}
    array2: dw 191, 122, 165, 165       ;An array of 4 words
    array3: dd 111, 111, 111            ;An array of 3 dwords

We can also use the TIMES keyword to repeat an element with a given value and thus easily create array elements:
Eg:

    array1: TIMES 100 db 1      ; An array of 100 bytes with each element=1
    array2: TIMES 20 dw 2       ; An array of 20 words with each element=2

2. Reading an n-size array
Pseudo Code:
i=0 while(i