Module 2 - Introduction to AVR Assembly

1. Introduction to AVR Assembly Language
Assembly is a low-level programming language that allows manipulation of every bit in memory, resulting in highly efficient and fast code. It has a strong one-to-one correspondence with the machine code instructions of the computer architecture. 
 On Arduino microcontrollers (specifically the ATmega328P), Assembly programming enables high-level control suitable for real-time systems and applications requiring complex mathematical processes. 
 Advantages of Using Assembly: 
 
 High efficiency: Full control over memory usage and execution time. 
 Deep understanding: Helps understand fundamental microcontroller operations. 
 Problem solving: Can solve problems that may arise in other high-level languages. 
 
 Disadvantages: 
 
 Steep learning curve: Requires deep understanding of hardware architecture. 
 Longer code: For simple tasks, Assembly code is much longer compared to high-level languages.

2. ATmega328P Hardware & Memory Architecture
A. Memory Map 
 The ATmega328P memory map provides information on how the Microcontroller Unit (MCU) uses memory. Here is the address division: 
 
 
 
 Category 
 Address 
 Size 
 Description 
 
 
 
 
 General Purpose Registers 
 0x0000 - 0x001F 
 32 x 8 bit 
 Registers R0 - R31 
 
 
 I/O Registers 
 0x0020 - 0x005F 
 64 x 8 bit 
 Accessible via IN / OUT instructions 
 
 
 Extended I/O Registers 
 0x0060 - 0x00FF 
 160 x 8 bit 
 Additional I/O registers 
 
 
 Internal SRAM 
 0x0100 - 0x08FF 
 2048 x 8 bit 
 Internal data memory 
 
 
 
 B. General Purpose Working Registers (GPR) 
 The AVR architecture has 32 general-purpose registers labeled R0 through R31 . These registers function as temporary storage for data during processing and are directly connected to the ALU (Arithmetic Logic Unit). 
 Register Division: 
 
 
 
 Group 
 Registers 
 Characteristics 
 
 
 
 
 Lower Registers 
 R0 - R15 
 Limited functionality. Cannot store immediate values directly (cannot use LDI instruction). 
 
 
 Upper Registers 
 R16 - R31 
 More flexible. Can work with immediate data, allowing direct storage of bytes or words. 
 
 
 
 Pointer Registers: 
 The last six registers (R26 through R31) can be combined into 16-bit pointers for indirect memory addressing: 
 
 
 
 Pointer Name 
 Low Register 
 High Register 
 Function 
 
 
 
 
 X Register 
 R26 (XL) 
 R27 (XH) 
 Pointer for memory access 
 
 
 Y Register 
 R28 (YL) 
 R29 (YH) 
 Pointer for memory access 
 
 
 Z Register 
 R30 (ZL) 
 R31 (ZH) 
 Pointer for memory & flash access

3. Input/Output (I/O) Programming
On the Arduino Uno (ATmega328P), digital I/O is controlled through Port B, Port C, and Port D . Each port is 8-bit, allowing control of up to 8 pins simultaneously. 
 
 A. Port to Arduino Pin Mapping 
 
 
 
 Port 
 Bits 
 Arduino Pin 
 Notes 
 
 
 
 
 Port B 
 PB0 - PB5 
 Digital Pin 8 - 13 
 PB6-PB7 are used for crystal oscillator 
 
 
 Port C 
 PC0 - PC5 
 Analog Pin A0 - A5 
 PC6 is the RESET pin 
 
 
 Port D 
 PD0 - PD7 
 Digital Pin 0 - 7 
 PD0 (RX) and PD1 (TX) for serial communication 
 
 
 
 B. Main I/O Registers 
 Three main registers control the behavior of each port: 
 
 
 
 Register 
 Full Name 
 Access 
 Function 
 
 
 
 
 DDRx 
 Data Direction Register 
 Read/Write 
 Configures pin direction. 0 = Input, 1 = Output 
 
 
 PORTx 
 Data Register 
 Read/Write 
 If Output: Sets logic High (1) or Low (0). If Input: Activates internal Pull-up resistor (1) or Tri-state (0) 
 
 
 PINx 
 Input Pins Address 
 Read Only 
 Reads the physical logic state of the pin ( 0 or 1 ) 
 
 
 
 (Replace 'x' with Port name, e.g., DDRB, PORTB, PINB) 
 C. Register Bit Configuration Details 
 DDRx - Data Direction Register 
 
 
 
 DDRx Bit Value 
 Pin Direction 
 Explanation 
 
 
 
 
 0 
 Input 
 Pin is configured as input (high impedance) 
 
 
 1 
 Output 
 Pin is configured as output (source/sink current) 
 
 
 
 PORTx - Data Register (Depends on DDRx Configuration) 
 
 
 
 DDRx 
 PORTx 
 Mode 
 Pin Condition 
 
 
 
 
 0 (Input) 
 0 
 Tri-state (Hi-Z) 
 Pin is floating, no pull-up 
 
 
 0 (Input) 
 1 
 Input Pull-up 
 Internal pull-up resistor active, pin defaults to HIGH 
 
 
 1 (Output) 
 0 
 Output Low 
 Pin outputs 0V (GND) 
 
 
 1 (Output) 
 1 
 Output High 
 Pin outputs 5V (VCC) 
 
 
 
 PINx - Input Pins Register 
 
 
 
 PINx Bit Value 
 Pin Status 
 Explanation 
 
 
 
 
 0 
 LOW 
 Pin voltage is below threshold (near 0V) 
 
 
 1 
 HIGH 
 Pin voltage is above threshold (near 5V)

4. Assembly Integration with Arduino IDE
To combine Assembly with Arduino C++ code, the extern "C" directive is used in the .ino file and the .global directive is used in the .S (Assembly) file. 
 File Structure: 
 .ino File (C/C++): 
 extern "C" {
 void start(); // Declaration of function defined in Assembly
 void loop_asm(); // Another function from Assembly
}

void setup() {
 start(); // Call Assembly function for initialization
}

void loop() {
 loop_asm(); // Call Assembly function for main loop
}
 
 .S File (Assembly): 
 #define __SFR_OFFSET 0x00
#include "avr/io.h"

.global start
.global loop_asm

start:
 SBI DDRB, 5 ; Set PB5 (Pin 13) as Output
 RET ; Return to caller

loop_asm:
 SBI PORTB, 5 ; Turn on LED
 ; ... other code
 RET
 
 Directive Explanations: 
 
 #define __SFR_OFFSET 0x00 : Sets the offset for I/O registers to use symbolic names (DDRB, PORTB, etc.). 
 #include "avr/io.h" : Includes register definitions for the AVR chip. 
 .global : Makes label/function accessible from other files (exported symbol). 
 RET : Instruction to return from subroutine to the calling program.

5. AVR Assembly Instruction Set
Operand Notation 
 Before diving into the instructions, here are the common operand symbols used: 
 
 
 
 Symbol 
 Description 
 
 
 
 
 Rd 
 Destination register (R0-R31). The result of the operation is stored here. 
 
 
 Rr 
 Source register (R0-R31). Used as input for the operation. 
 
 
 K 
 Constant/Immediate value (8-bit: 0-255 or 0x00-0xFF). 
 
 
 k 
 Address constant for SRAM or program memory. 
 
 
 A 
 I/O register address (0-63 for IN/OUT, 0-31 for SBI/CBI). 
 
 
 b 
 Bit number (0-7) within a register or I/O address. 
 
 
 X, Y, Z 
 Pointer registers (X=R27:R26, Y=R29:R28, Z=R31:R30). 
 
 
 
 Note: Some instructions only work with upper registers (R16-R31), such as LDI , ANDI , ORI , SUBI , SBCI , and CPI . 
 A. Data Transfer Instructions 
 Used to move data between registers or between registers and memory/I/O. 
 
 
 
 Mnemonic 
 Operand 
 Description 
 Example 
 Notes 
 
 
 
 
 LDI 
 Rd, K 
 Load Immediate 
 LDI R16, 0xFF 
 Loads 8-bit constant K into register Rd (R16-R31 only) 
 
 
 MOV 
 Rd, Rr 
 Move/Copy Register 
 MOV R0, R1 
 Copies contents of register Rr to Rd 
 
 
 IN 
 Rd, A 
 Input from I/O 
 IN R16, PINB 
 Reads data from I/O port A to register Rd 
 
 
 OUT 
 A, Rr 
 Output to I/O 
 OUT PORTB, R16 
 Sends data from register Rr to I/O port A 
 
 
 LDS 
 Rd, k 
 Load from SRAM 
 LDS R16, 0x0100 
 Loads data from SRAM address k to register Rd 
 
 
 STS 
 k, Rr 
 Store to SRAM 
 STS 0x0100, R16 
 Stores register Rr contents to SRAM address k 
 
 
 LD 
 Rd, X/Y/Z 
 Load Indirect 
 LD R16, X 
 Loads data from address pointed by pointer X/Y/Z 
 
 
 ST 
 X/Y/Z, Rr 
 Store Indirect 
 ST X, R16 
 Stores data to address pointed by pointer X/Y/Z 
 
 
 PUSH 
 Rr 
 Push to Stack 
 PUSH R16 
 Saves register to stack 
 
 
 POP 
 Rd 
 Pop from Stack 
 POP R16 
 Retrieves data from stack to register 
 
 
 
 B. Bit Manipulation Instructions (I/O Specific) 
 These instructions operate on the lower 32 I/O addresses ($00-$1F). Very efficient for changing one bit without affecting other bits. 
 
 
 
 Mnemonic 
 Operand 
 Description 
 Example 
 Notes 
 
 
 
 
 SBI 
 A, b 
 Set Bit in I/O 
 SBI DDRB, 5 
 Sets bit b in I/O register A to 1 
 
 
 CBI 
 A, b 
 Clear Bit in I/O 
 CBI PORTB, 5 
 Clears bit b in I/O register A to 0 
 
 
 BST 
 Rr, b 
 Bit Store to T 
 BST R16, 3 
 Copies bit b from register Rr to T flag 
 
 
 BLD 
 Rd, b 
 Bit Load from T 
 BLD R17, 5 
 Copies T flag to bit b of register Rd 
 
 
 
 C. Arithmetic Instructions 
 
 
 
 Mnemonic 
 Operand 
 Description 
 Example 
 Notes 
 
 
 
 
 ADD 
 Rd, Rr 
 Add 
 ADD R1, R2 
 Rd = Rd + Rr 
 
 
 ADC 
 Rd, Rr 
 Add with Carry 
 ADC R1, R2 
 Rd = Rd + Rr + C (Carry flag) 
 
 
 SUB 
 Rd, Rr 
 Subtract 
 SUB R16, R17 
 Rd = Rd - Rr 
 
 
 SBC 
 Rd, Rr 
 Subtract with Carry 
 SBC R16, R17 
 Rd = Rd - Rr - C 
 
 
 SUBI 
 Rd, K 
 Subtract Immediate 
 SUBI R16, 10 
 Rd = Rd - K (R16-R31 only) 
 
 
 SBCI 
 Rd, K 
 Subtract Immediate with Carry 
 SBCI R17, 0 
 Rd = Rd - K - C 
 
 
 INC 
 Rd 
 Increment 
 INC R16 
 Rd = Rd + 1 
 
 
 DEC 
 Rd 
 Decrement 
 DEC R16 
 Rd = Rd - 1 
 
 
 MUL 
 Rd, Rr 
 Multiply Unsigned 
 MUL R16, R17 
 R1:R0 = Rd × Rr (16-bit result) 
 
 
 MULS 
 Rd, Rr 
 Multiply Signed 
 MULS R16, R17 
 R1:R0 = Rd × Rr (signed) 
 
 
 NEG 
 Rd 
 Negate (Two's Complement) 
 NEG R16 
 Rd = 0x00 - Rd 
 
 
 
 D. Logic Instructions 
 
 
 
 Mnemonic 
 Operand 
 Description 
 Example 
 Notes 
 
 
 
 
 AND 
 Rd, Rr 
 Logical AND 
 AND R1, R2 
 Rd = Rd AND Rr 
 
 
 ANDI 
 Rd, K 
 AND Immediate 
 ANDI R16, 0x0F 
 Rd = Rd AND K (masking) 
 
 
 OR 
 Rd, Rr 
 Logical OR 
 OR R1, R2 
 Rd = Rd OR Rr 
 
 
 ORI 
 Rd, K 
 OR Immediate 
 ORI R16, 0x80 
 Rd = Rd OR K 
 
 
 EOR 
 Rd, Rr 
 Exclusive OR 
 EOR R16, R17 
 Rd = Rd XOR Rr 
 
 
 COM 
 Rd 
 One's Complement 
 COM R16 
 Rd = 0xFF - Rd (inverts all bits) 
 
 
 CLR 
 Rd 
 Clear Register 
 CLR R16 
 Rd = 0 (same as EOR Rd, Rd) 
 
 
 SER 
 Rd 
 Set Register 
 SER R16 
 Rd = 0xFF (R16-R31 only) 
 
 
 
 E. Shift & Rotate Instructions 
 
 
 
 Mnemonic 
 Operand 
 Description 
 Example 
 Notes 
 
 
 
 
 LSL 
 Rd 
 Logical Shift Left 
 LSL R16 
 Shift left, bit 0 = 0, bit 7 → Carry 
 
 
 LSR 
 Rd 
 Logical Shift Right 
 LSR R16 
 Shift right, bit 7 = 0, bit 0 → Carry 
 
 
 ROL 
 Rd 
 Rotate Left through Carry 
 ROL R16 
 Rotate left through Carry flag 
 
 
 ROR 
 Rd 
 Rotate Right through Carry 
 ROR R16 
 Rotate right through Carry flag 
 
 
 ASR 
 Rd 
 Arithmetic Shift Right 
 ASR R16 
 Shift right, bit 7 remains (preserve sign) 
 
 
 SWAP 
 Rd 
 Swap Nibbles 
 SWAP R16 
 Swaps upper and lower 4-bits in register 
 
 
 
 F. Branch & Control Flow Instructions 
 
 
 
 Mnemonic 
 Operand 
 Description 
 Example 
 Notes 
 
 
 
 
 RJMP 
 k 
 Relative Jump 
 RJMP loop 
 Jump to label k (±2K words) 
 
 
 JMP 
 k 
 Jump 
 JMP far_label 
 Jump to 22-bit address (all memory) 
 
 
 RCALL 
 k 
 Relative Call 
 RCALL delay 
 Call subroutine relative to PC 
 
 
 CALL 
 k 
 Call 
 CALL far_sub 
 Call subroutine at 22-bit address 
 
 
 RET 
 - 
 Return 
 RET 
 Return from subroutine 
 
 
 RETI 
 - 
 Return from Interrupt 
 RETI 
 Return from interrupt handler 
 
 
 CP 
 Rd, Rr 
 Compare 
 CP R16, R17 
 Compare Rd with Rr (updates flags) 
 
 
 CPI 
 Rd, K 
 Compare Immediate 
 CPI R16, 5 
 Compare Rd with constant K 
 
 
 CPC 
 Rd, Rr 
 Compare with Carry 
 CPC R17, R19 
 For multi-byte comparison 
 
 
 BREQ 
 k 
 Branch if Equal 
 BREQ target 
 Jump if Z flag = 1 (result equal) 
 
 
 BRNE 
 k 
 Branch if Not Equal 
 BRNE loop 
 Jump if Z flag = 0 (result not equal) 
 
 
 BRLO 
 k 
 Branch if Lower 
 BRLO less 
 Jump if C flag = 1 (unsigned <) 
 
 
 BRSH 
 k 
 Branch if Same or Higher 
 BRSH greater 
 Jump if C flag = 0 (unsigned ≥) 
 
 
 BRLT 
 k 
 Branch if Less Than 
 BRLT neg 
 Jump if S flag = 1 (signed <) 
 
 
 BRGE 
 k 
 Branch if Greater or Equal 
 BRGE pos 
 Jump if S flag = 0 (signed ≥) 
 
 
 
 G. Skip Instructions 
 
 
 
 Mnemonic 
 Operand 
 Description 
 Example 
 Notes 
 
 
 
 
 SBIS 
 A, b 
 Skip if Bit in I/O Set 
 SBIS PINB, 0 
 Skip next instruction if bit = 1 
 
 
 SBIC 
 A, b 
 Skip if Bit in I/O Cleared 
 SBIC PIND, 2 
 Skip next instruction if bit = 0 
 
 
 SBRS 
 Rr, b 
 Skip if Bit in Register Set 
 SBRS R16, 7 
 Skip if bit b in register = 1 
 
 
 SBRC 
 Rr, b 
 Skip if Bit in Register Cleared 
 SBRC R16, 0 
 Skip if bit b in register = 0 
 
 
 
 H. Other Instructions 
 
 
 
 Mnemonic 
 Operand 
 Description 
 Example 
 Notes 
 
 
 
 
 NOP 
 - 
 No Operation 
 NOP 
 Does nothing (1 clock cycle) 
 
 
 SLEEP 
 - 
 Sleep 
 SLEEP 
 Enters sleep mode (power saving) 
 
 
 WDR 
 - 
 Watchdog Reset 
 WDR 
 Resets watchdog timer 
 
 
 SBIW 
 Rd, K 
 Subtract Immediate from Word 
 SBIW R24, 1 
 Subtract K from 16-bit value (R25:R24) 
 
 
 ADIW 
 Rd, K 
 Add Immediate to Word 
 ADIW R24, 1 
 Add K to 16-bit value

6. Status Register (SREG)
The Status Register contains flags that indicate the results of arithmetic/logic operations. This register is crucial for branch instructions. 
 
 
 
 Bit 
 Name 
 Description 
 
 
 
 
 7 
 I (Global Interrupt Enable) 
 Enables/disables global interrupts 
 
 
 6 
 T (Bit Copy Storage) 
 Storage for BLD/BST instructions 
 
 
 5 
 H (Half Carry Flag) 
 Carry from bit 3 to bit 4 (for BCD) 
 
 
 4 
 S (Sign Flag) 
 S = N ⊕ V (for signed operations) 
 
 
 3 
 V (Overflow Flag) 
 Two's complement overflow 
 
 
 2 
 N (Negative Flag) 
 Result is negative (bit 7 = 1) 
 
 
 1 
 Z (Zero Flag) 
 Result = 0 
 
 
 0 
 C (Carry Flag) 
 Carry/borrow from operation

7. Delay Implementation Without Library
Delays can be created using nested loops that consume a certain number of clock cycles. 
 Delay Calculation Concept: 
 
 ATmega328P on Arduino Uno runs at 16 MHz (16 million clock cycles per second) 
 1 millisecond = 16,000 clock cycles 
 DEC instruction takes 1 cycle, BRNE takes 2 cycles (if branch taken) 
 
 Delay Implementation Examples: 
 ; Delay approximately 1 second (with nested loop)
delay_1s:
 LDI R18, 64 ; Outer counter
outer_loop:
 LDI R24, lo8(62500) ; Inner counter low byte
 LDI R25, hi8(62500) ; Inner counter high byte
inner_loop:
 SBIW R24, 1 ; Subtract 16-bit counter (2 cycles)
 BRNE inner_loop ; Loop if not 0 (2 cycles if taken)
 DEC R18 ; Subtract outer counter
 BRNE outer_loop ; Loop outer if not 0
 RET

; Simple delay with single loop
delay_simple:
 LDI R16, 255 ; Load counter
delay_loop:
 DEC R16 ; Decrement counter (1 cycle)
 BRNE delay_loop ; Branch if not zero (2 cycles)
 RET ; Return (approximately 765 cycles total)

8. Complete Program Examples
A. Blink LED 
 #define __SFR_OFFSET 0x00
#include "avr/io.h"

.global main

main:
 SBI DDRB, 5 ; Set PB5 (Pin 13) as Output

loop:
 SBI PORTB, 5 ; Turn on LED (Output HIGH)
 RCALL delay ; Call delay subroutine
 CBI PORTB, 5 ; Turn off LED (Output LOW)
 RCALL delay ; Call delay subroutine
 RJMP loop ; Repeat continuously

delay:
 LDI R18, 82 ; Outer loop counter
outer:
 LDI R24, lo8(60000) ; Inner loop counter (low byte)
 LDI R25, hi8(60000) ; Inner loop counter (high byte)
inner:
 SBIW R24, 1 ; Subtract word (R25:R24)
 BRNE inner ; Loop if not 0
 DEC R18 ; Subtract outer counter
 BRNE outer ; Loop outer if not 0
 RET ; Return to caller
 
 B. Reading Button and Controlling LED 
 #define __SFR_OFFSET 0x00
#include "avr/io.h"

.global main

main:
 ; Setup
 SBI DDRB, 5 ; PB5 (Pin 13) as Output (LED)
 CBI DDRD, 2 ; PD2 (Pin 2) as Input (Button)
 SBI PORTD, 2 ; Activate Pull-up on PD2

loop:
 SBIC PIND, 2 ; Skip next instruction if button pressed (LOW)
 RJMP led_off ; If not pressed, turn off LED
 
led_on:
 SBI PORTB, 5 ; Turn on LED
 RJMP loop ; Return to loop

led_off:
 CBI PORTB, 5 ; Turn off LED
 RJMP loop ; Return to loop
 
 C. Toggle LED with Button (Simple Debounce) 
 #define __SFR_OFFSET 0x00
#include "avr/io.h"

.global main

main:
 ; Initialization
 SBI DDRB, 5 ; PB5 as Output (LED)
 CBI DDRD, 2 ; PD2 as Input (Button)
 SBI PORTD, 2 ; Activate internal Pull-up
 CLR R20 ; R20 = LED status (0 = off)

wait_press:
 SBIC PIND, 2 ; Wait for button pressed (LOW)
 RJMP wait_press
 
 ; Button pressed - toggle LED
 SBRC R20, 0 ; Skip if bit 0 of R20 = 0 (LED off)
 RJMP turn_off
 
turn_on:
 SBI PORTB, 5 ; Turn on LED
 LDI R20, 1 ; Set status = on
 RJMP debounce

turn_off:
 CBI PORTB, 5 ; Turn off LED
 CLR R20 ; Set status = off

debounce:
 RCALL delay ; Delay for debounce
 
wait_release:
 SBIS PIND, 2 ; Wait for button released (HIGH)
 RJMP wait_release
 RCALL delay ; Delay debounce after release
 RJMP wait_press ; Return to wait for press

delay:
 LDI R18, 50
d_outer:
 LDI R24, lo8(10000)
 LDI R25, hi8(10000)
d_inner:
 SBIW R24, 1
 BRNE d_inner
 DEC R18
 BRNE d_outer
 RET