

#### **SNS COLLEGE OF TECHNOLOGY**

Coimbatore-35 An Autonomous Institution



Accredited by NBA – AICTE and Accredited by NAAC – UGC with 'A++' Grade

# DEPARTMENT OF ELECTRONICS & COMMUNICATION ENGINEERING

#### **19ECT312 – EMBEDDED SYSTEM DESIGN**

III YEAR/ VI SEMESTER

**UNIT 4 : EMBEDDED OPERATING SYSTEM AND MODELING** 

**TOPIC 4.4 : MEMORY MANAGEMENT** 





#### Associative memory







#### **Associative Memory**

Associative memory – parallel search

| Page # | Frame # |
|--------|---------|
|        |         |
|        |         |
|        |         |
|        |         |

- Address translation (p, d)
  - If p is in associative register, get frame # out
  - Otherwise get frame # from page table in memory

# Implementation of Page Tab

- For each process, Page table is kept in main memory
- Page-table base register (PTBR) points to the page table
- Page-table length register (PTLR) indicates size of the page table
- In this scheme every data/instruction access requires two memory accesses
  - One for the page table and one for the data / instruction
- The two memory access problem can be solved by the use of a special fast-lookup hardware cache called associative memory or translation look-aside buffers (TLBs)
- TLBs typically small (64 to 1,024 entries)
- On a TLB miss, value is loaded into the TLB for faster access next time
  - Replacement policies must be considered (LRU)
  - Some entries can be wired down for permanent fast access
- Some TLBs store address-space identifiers (ASIDs) in each TLB entry uniquely identifies each process (PID) to provide address-space protection for that process
  - Otherwise need to flush at every context switch



# Paging Hardware With TLB









- Associative Lookup =  $\varepsilon$  time unit
  - Can be < 10% of memory access time</li>

Hit ratio =  $\alpha$ 

 Hit ratio – percentage of times that a page number is found in the associative registers; ratio related to size of TLB

Consider  $\alpha$  = 80%,  $\epsilon$  = 20ns for TLB search, 100ns for memory access

Effective Access Time (EAT) EAT =  $(100 + \varepsilon) \alpha + (200 + \varepsilon)(1 - \alpha)$ 

Consider  $\alpha$  = 80%,  $\epsilon$  = 20ns for TLB search, 100ns for memory access - EAT = 0.80 x 120 + 0.20 x 220 = 140ns

Consider better hit ratio ->  $\alpha$  = 98%,  $\epsilon$  = 20ns for TLB search, 100ns for memory access

- EAT = 0.98 x 120 + 0.02 x 220 = 122ns



## **Memory Protection**



- Memory protection implemented by associating protection bit with each frame to indicate if read-only or read-write access is allowed
  - Can also add more bits to indicate page execute-only, and so on
- Valid-invalid bit attached to each entry in the page table:
  - "valid" indicates that the associated page is in the process' logical address space, and is thus a legal page
  - "invalid" indicates that the page is not in the process' logical address space
  - Or use PTLR
- Any violations result in a trap to the kernel

#### Valid (v) or Invalid (i) Bit In A Page Table



#### 14 bit address space (0 to 16383) Page size 2KB 0 Process P1 uses only 0 to 10468 1 2 page 0 P1 e 0 00000 frame number valid-invalid bit 3 page 1 page 0 e 1 0 2 V e 2 4 page 2 page 1 3 1 v 2 4 V e 3 5 page 2 3 7 V 8 page 3 4 V 6 5 9 V 7 page 3 page 4 6 0 7 0 î. 10,468 page 5 8 page 4 page table 712,287 9 page 5 nal fragmentation Use of PTLR (length) page n



### Shared Pages Example



- System with 40 users
  - Use common text editor
- Text editor contains 150KB code 50KB data (page size 50KB)
  - 8000KB!
- Shared code
  - One copy of read-only (reentrant) code shared among processes (i.e., text editors, compilers, window systems)
    - Code never changes during execution
- Only one copy of the editor in the memory
- Total memory consumption
  - 40\*50+150=2150KB



#### **Shared Pages Example**







#### Data share: example







# Structure of the Page Table

- Memory requirement for page table can get huge using straight-forward methods
  - Consider a 32-bit logical address space as on modern computers
  - Page size of 4 KB (2<sup>12</sup>)
  - Page table would have 1 million entries 2<sup>20</sup> (2<sup>32</sup> / 2<sup>12</sup>)
  - If each entry is 4 bytes -> 4 MB of physical address space / memory for page table alone
    - That amount of memory used to cost a lot
    - Don't want to allocate that contiguously in main memory

- Hierarchical Paging
- Hashed Page Tables
- Inverted Page Tables





 Break up the page table into multiple pages

• We then page the page table

A simple technique is a two-level page table





### Two-Level Paging Example



Iogical address (on 32-bit machine with 4KB page size) is divided into:

- a page number consisting of 20 bits
- a page offset consisting of 12 bits

Since the page table is paged, the page number is further divided into:

- a 10-bit page number
- a 10-bit page offset

Thus, a logical address is as follows:

| p | bage nur | nber                  | page offset |
|---|----------|-----------------------|-------------|
|   | $p_1$    | <i>p</i> <sub>2</sub> | d           |
|   | 10       | 10                    | 12          |

where  $p_1$  is an index into the outer page table, and  $p_2$  is the displacement within the page of the inner page table





5/8/







Even two-level paging scheme not sufficient If page size is 4 KB (2<sup>12</sup>)

- Then page table has 2<sup>52</sup> entries
- If two level scheme, inner page tables could be 2<sup>10</sup> 4-byte entries
- Address would look like

|       | ir    | nner pag              | е           |
|-------|-------|-----------------------|-------------|
| outer |       |                       | page offset |
|       | $p_1$ | <i>p</i> <sub>2</sub> | d           |
|       | 42    | 10                    | 12          |

- Outer page table has 2<sup>42</sup> entries or 2<sup>44</sup> bytes
- One solution is to add a 2<sup>nd</sup> outer page table
- But in the following example the 2<sup>nd</sup> outer page table is still 2<sup>34</sup> bytes in size
  - And possibly 4 memory access to get to one physical memory location





| outer page | inner page            | offset |
|------------|-----------------------|--------|
| $p_1$      | <i>p</i> <sub>2</sub> | d      |
| 42         | 10                    | 12     |

| 2nd outer page | outer page | inner page            | offset |
|----------------|------------|-----------------------|--------|
| $p_1$          | $p_2$      | <i>p</i> <sub>3</sub> | d      |
| 32             | 10         | 10                    | 12     |

ARC (32 bits), Motorola 68030 support three and four level paging respectively



#### Hashed Page Tables



Common in virtual address spaces > 32 bits

The page number is hashed into a page table

This page table contains a chain of elements hashing to the same location

Each element contains (1) the page number (2) the value of the mapped page frame (3) a pointer to the next element

Virtual page numbers are compared in this chain searching for a match

- If a match is found, the corresponding physical frame is extracted





#### Hashed Page Table





#### **Inverted Page Table**



- Rather than each process having a page table and keeping track of all possible logical pages, track all frames
- One entry for each frame
- Entry consists the page number stored in that frame, with information about the process that owns that page
- Decreases memory needed to store each page table,
  - but increases time needed to search the table when a page reference occurs





### Segmentation



- Memory-management scheme that supports user view of memory
- A program is a collection of segments – A segment is a logical unit such as: Compiler generates the main program segments procedure Loader assign the seg# function method object local variables, global variables common block stack symbol table arrays









User specifies each address by two quantities (a) Segment name

(b) Segment offset

Logical address contains the tuple <segment#, offset>

- Variable size segments without order
- Length=> purpose of the program
- Elements are identified by offset



Dr.B.Sivasankari/Professor/ECE/SNSCT





| Windows XP Memory Usage |               |              |                              |  |  |  |
|-------------------------|---------------|--------------|------------------------------|--|--|--|
| Segment                 | First Address | Last Address | Size                         |  |  |  |
| Code                    | 401000x       | 403000x      | 002000x<br>~ 8 Kbytes        |  |  |  |
| Static (Global)<br>Data | 403000x       | 703000x      | 300000x<br>~ 3 megabytes     |  |  |  |
| Heap                    | 760000x       | 3A261000x    | 39800000x<br>~ 950 megabytes |  |  |  |
| Stack                   | 22EF00x       | 16EF00x      | 1C0000x<br>~ 2 megabyte      |  |  |  |





#### LINUX Memory Usage

| Segment                 | First Address | Last Address | Size                         |
|-------------------------|---------------|--------------|------------------------------|
| Code                    | 8048400x      | 8049900x     | 001500x<br>~ 6 Kbytes        |
| Static (Global)<br>Data | 8049A00x      | 8349A00      | 300000x<br>~ 3 megabytes     |
| Heap                    | B7EE,B000x    | 01CE,4000x   | B6000000x<br>~ 3 gigabytes   |
| Stack                   | BFFB,7334x    | 29BA,91E0x   | 9640,0000x<br>~ 2.5 gigabyte |

#### Memory image



| Per S      |                      |       |        |       |      |        | -                             |          | INSTITUTIONS                                                              |
|------------|----------------------|-------|--------|-------|------|--------|-------------------------------|----------|---------------------------------------------------------------------------|
| A08048368  | <main+0>:</main+0>   | 55    |        |       |      | push   | %ebp                          |          |                                                                           |
| 0x08048369 | <main+l>:</main+l>   | 89 e5 |        |       |      | mov    | %esp,%ebp                     |          |                                                                           |
| 0x0804836b | <main+3>:</main+3>   | 83 ec | 08     |       |      | sub    | \$0x8,%esp                    |          |                                                                           |
| 0x0804836e | <main+6>:</main+6>   | 83 e4 | £O     |       |      | and    | <pre>\$0xfffffff0,%esp</pre>  | 1        | void b():                                                                 |
| 0x08048371 | <main+9>:</main+9>   | b8 00 | 00 00  | 00 (  |      | mov    | \$0x0,%eax                    | 2        | void c();                                                                 |
| 0x08048376 | <main+14>:</main+14> | 83 c0 | 0f     |       |      | add    | \$0xf,%eax                    | 3        | int main()                                                                |
| 0x08048379 | <main+17>:</main+17> | 83 c0 | 0£     |       |      | add    | \$0xf,%eax                    | 5        | printf( "Hello from main'ın");                                            |
| 0x0804837c | <main+20>:</main+20> | c1 e8 | 04     |       |      | shr    | \$0x4,%eax                    | 6        | b();                                                                      |
| 0x0804837f | <main+23>:</main+23> | c1 e0 | 04     |       |      | shl    | \$0x4,%eax                    | 8        | // This routine reads the opcodes from memory and prints them out.        |
| 0x08048382 | <main+26>:</main+26> | 29 c4 |        |       |      | sub    | <pre>%eax, %esp</pre>         | 9<br>10  | void b()                                                                  |
| 0x08048384 | <main+28>:</main+28> | 83 ec | 0c     |       |      | sub    | \$0xc,%esp                    | 11       | char *moving;                                                             |
| 0x08048387 | <main+31>:</main+31> | 68 c0 | 84 0   | 80    |      | push   | \$0x80484c0                   | 12<br>13 | for ( moving = (char *)(&main); moving < (char *)(&c); moving++ )         |
| 0x0804838c | <main+36>:</main+36> | e8 1f | ff f:  | E ff  |      | call   | 0x80482b0                     | 14       | printi( "Addr = 0x%x, Value = %2x\n", (int)(moving), 255 & (int)*moving ) |
| 0x08048391 | <main+41>:</main+41> | 83 c4 | 10     |       |      | add    | \$0x10,%esp                   | 15<br>16 | ]<br>void c()                                                             |
| 0x08048394 | <main+44>:</main+44> | e8 02 | 00 00  | 00    |      | call   | 0x804839b <b></b>             | 17<br>18 | (<br>]                                                                    |
| 0x080483   | 9b <b+0>:</b+0>      | 55    |        |       |      | push   | %ebp                          |          |                                                                           |
| 0x080483   | 9c <b+1>:</b+1>      | 89 e  | 5      |       |      | mov    | %esp,%ebp                     |          |                                                                           |
| 0x080483   | 9e <b+3>:</b+3>      | 83 e  | c 08   |       |      | sub    | \$0x8,%esp                    |          |                                                                           |
| 0x080483   | al <b+6>:</b+6>      | c7 4  | 5 fc   | 8 83  | 04 0 | 8 movl | \$0x8048368,0xffff            | ffc(     | %ebp)                                                                     |
| 0x080483   | a8 <b+13>:</b+13>    | 81 7  | d fc ( | 19 83 | 04 0 | 8 cmpl | \$0x80483d9,0xffff            | ffc(     | %ebp)                                                                     |
| 0x080483   | af <b+20>:</b+20>    | 73 2  | 6      |       |      | jae    | 0x80483d7 <b+60></b+60>       |          | 55-1-19 ( <b>#</b> -200)                                                  |
| 0x080483   | b1 <b+22>:</b+22>    | 83 e  | c 04   |       |      | sub    | \$0x4,%esp                    |          |                                                                           |
| 0x080483   | b4 <b+25>:</b+25>    | 8b 4  | 5 fc   |       |      | mov    | <pre>0xfffffffc(%ebp),%</pre> | eax      |                                                                           |
|            |                      |       |        |       |      |        |                               |          |                                                                           |

\$0xff,%eax

movsbl (%eax),%eax

and

0x080483b7 <b+28>:

0x080483ba <b+31>:

0f be 00

25 ff 00 00 00



|         |                  |               | Symbol t           | able    |                 |
|---------|------------------|---------------|--------------------|---------|-----------------|
|         |                  | $\mathbf{i}$  | Name               | address |                 |
| <b></b> |                  |               | SQR                | 0       |                 |
| out     |                  |               | SUM                | 4       | Virtual address |
|         |                  | $\mathcal{I}$ |                    |         | space           |
|         | Paging view      |               |                    |         |                 |
| 0       | Load             | 0             |                    |         |                 |
| U       | LOUG             |               |                    |         |                 |
| 4       | ADD              | 4             |                    |         |                 |
|         |                  |               | 1                  |         |                 |
|         | ADD<br>Segmentat |               | /<br><st,0></st,0> |         |                 |



#### Segmentation Architecture



- Logical address consists of a two tuple: <segment-number, offset>
- Segment table maps two-dimensional logical address to physical address;
- Each table entry has:
  - base contains the starting physical address where the segments reside in memory
  - limit specifies the length of the segment
- Segment-table base register (STBR) points to the segment table's location in memory
- Segment-table length register (STLR) indicates number of segments used by a program; segment number s is legal if s < STLR</li>

#### **Example of Segmentation**









### Segmentation Hardware



Dr.B.Sivasankari/Professor/ECE/SNSCT

#### **Example of Segmentation**







#### Segmentation Architecture



- Protection
- Protection bits associated with segments
  - With each entry in segment table associate:
    - validation bit =  $0 \Rightarrow$  illegal segment
    - read/write/execute privileges
- Code sharing occurs at segment level
- Since segments vary in length, memory allocation is a dynamic storage-allocation problem
  - Long term scheduler
  - First fit, best fit etc
- Fragmentation





#### Key idea:

Segments are splitted into multiple pages

# Each page is loaded into frames in the memory



### Segmentation with Paging

- Supports segmentation with paging
  - Each segment can be 4 GB
  - Up to 16 K segments per process
  - <selector(16), offset (32)>
  - Divided into two partitions

S(13) G(1) P(2)

Intel 80386

IBM OS/2

- First partition of up to 8 K segments are private to process (kept in local descriptor table LDT)
- Second partition of up to 8K segments shared among all processes (kept in global descriptor table GDT)
- CPU generates logical address (six Segment Reg.)
  - Given to segmentation unit
    - Which produces linear addresses
  - Physical address 32 bits
  - Linear address given to paging unit
    - Which generates physical address in main memory
    - Paging units form equivalent of MMU
    - Pages sizes can be 4 KB











# Intel Pentium Segmentation



