§ A job’s size is no longer restricted to the size of main memory (or the free space within main memory).
§ Memory is used more efficiently because the only sections of a job stored in memory are those needed immediately, while those not needed remain in secondary storage.
§ It allows an essentially unlimited degree of multiprogramming.
§ It eliminates external fragmentation and minimizes internal fragmentation by combining segmentation and paging.
§ It allows for sharing of code and data.
These far outweigh the disadvantages:
§ Increased hardware costs
§ Increased overhead for handling paging interrupts.
§ Increased software complexity to prevent thrashing.
EECS 678
Introduction to Operating Systems
Homework 3
(Total of 200 Points)
Answer Key
Question 1 (Chap 8; 50 Points)
Several short questions relating to the basic ideas of address space management and use.
1. Discuss the differences between logical and physical addresses and address spaces as they relate to an executing program.
o An address generated by the CPU as it executes a program is a logical address because it refers to a location within the logical address space used by the program currently running on the CPU. An address used to reference main memory is a physical address because it refers to a specific location in the physical memory. The entire set of physical memory available on the machine is the physical address space of that machine. The set of all logical pages used by a program is referred to as its logical address space; the set of physical pages supporting those logical pages is the portion of the physical address space backing the program's logical address space.
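The translation between the two spaces can be made concrete in a few lines of code. The following is a minimal sketch, assuming a 2K page size and a toy four-entry page table; translate, page_table, and the sample values are all hypothetical:

    #include <stdio.h>
    #include <stdint.h>

    #define PAGE_SIZE 2048                 /* assumed 2K page size */

    /* Toy page table: logical page number -> physical frame number.
     * Real page tables also carry valid, protection, and dirty bits. */
    static uint32_t page_table[] = { 5, 9, 2, 7 };

    /* Translate a logical address into a physical address.
     * No bounds or validity checks in this toy example. */
    uint32_t translate(uint32_t logical)
    {
        uint32_t page   = logical / PAGE_SIZE;   /* logical page number */
        uint32_t offset = logical % PAGE_SIZE;   /* offset within page  */
        return page_table[page] * PAGE_SIZE + offset;
    }

    int main(void)
    {
        uint32_t la = 3000;                      /* page 1, offset 952 */
        printf("logical %u -> physical %u\n", la, translate(la));
        return 0;
    }

Logical address 3000 falls in logical page 1 at offset 952, so it maps to physical address 9 × 2048 + 952 = 19384.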
2. Explain, compare, and contrast the following resource allocation policies:
o First-fit
§ Locate the first item from the resource pool that is big enough to satisfy the request for a resource, then stop searching. The search can start either at the beginning of the set of free items or where the previous first-fit search ended. First-fit decreases the average search time, but may not provide the best storage utilization, and it may exhibit larger fragmentation than best-fit. Whether that is internal or external fragmentation depends on how the first-fit block is used to fulfill the request: if only the requested space is allocated, the most obvious choice, then external fragmentation will generally be increased; if not, then internal fragmentation is increased. Remember also that this is average-case behavior; the behavior of any allocation scheme can look good or bad for a specific sequence of operations.
o Worst-fit
§ Allocate the largest item in the free pool, as long as it is big enough. This means that the allocation routine must search the entire list, unless it is sorted by size, which introduces another kind of overhead. Worst-fit produces the largest leftover item, making it easier to satisfy future requests, but it generally takes more time than first-fit because the whole list must be examined to find the worst-fit item. In this case, worst means the biggest one, the one that would leave the most left over. Although its intent is to leave the largest possible pieces and so mitigate fragmentation, worst-fit tends to quickly fragment the memory so that only relatively small pieces are left, making it impossible to satisfy large requests. First-fit is a popular choice since it is faster, and worst-fit has no clear advantages.
o Best-fit
§ Allocate the item in the free pool that is closest to the requested size, but still greater than or equal to the requested size. Best-fit produces the smallest leftover item, making it harder to satisfy future requests with the remaining piece, but it is attempting to waste the least memory. Finding the best-fit piece requires searching the whole list unless it is kept sorted by size, either of which imposes overhead. Best-fit tends to fragment the memory in the sense of creating small free pieces, but it leaves larger pieces for use when needed. First-fit is a popular choice since it is faster, and best-fit is not a clear winner. A sketch contrasting all three policies follows this list.
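Since the three policies differ only in which free block they choose, they are easy to contrast in a single routine. This is a hedged sketch, not a real allocator: pick_block and the free list represented as an array of block sizes are invented for illustration, and splitting the chosen block and maintaining the free list are omitted:

    #include <stdio.h>
    #include <stddef.h>

    /* Return an index into sizes[] (a toy free list of n block sizes)
     * for a request of req bytes, or -1 if nothing is large enough.
     * policy: 'f' = first-fit, 'b' = best-fit, 'w' = worst-fit. */
    int pick_block(const size_t *sizes, int n, size_t req, char policy)
    {
        int chosen = -1;
        for (int i = 0; i < n; i++) {
            if (sizes[i] < req)
                continue;                        /* too small to use */
            if (policy == 'f')
                return i;                        /* first-fit stops at once */
            if (chosen == -1 ||
                (policy == 'b' && sizes[i] < sizes[chosen]) ||  /* smallest that fits */
                (policy == 'w' && sizes[i] > sizes[chosen]))    /* largest that fits  */
                chosen = i;
        }
        return chosen;
    }

    int main(void)
    {
        size_t free_list[] = { 500, 128, 300, 64 };  /* toy free block sizes */
        printf("first=%d best=%d worst=%d\n",
               pick_block(free_list, 4, 100, 'f'),
               pick_block(free_list, 4, 100, 'b'),
               pick_block(free_list, 4, 100, 'w'));
        return 0;
    }

Note how first-fit can return as soon as it sees a large-enough block, while best-fit and worst-fit must scan the whole list (unless it is kept sorted by size), which is exactly the search-time trade-off described above.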
3. Compare and contrast internal and external fragmentation and give examples of a resource pool situation where each occurs.
o External fragmentation
§ As memory blocks are allocated and deallocated from a section of memory, the free memory space is usually not a single big block, but several smaller pieces. These tend to be scattered across the memory space holding the resource pool, as larger free blocks are used to satisfy requests for smaller blocks, and some allocated blocks are held for a long time while others are held for a shorter time. The total free space required to satisfy a given request may thus exist, but if it is not contiguous, it cannot be used, and the free memory is said to be externally fragmented.
o Internal fragmentation
§ When the allocated memory block is larger than the requested memory size, the unused portion is the "internal fragment". The difference between the requested and actually allocated sizes is the magnitude of the internal fragmentation. This means the fragmented memory is internal to an allocated block, but is not being used. Internal fragmentation usually occurs because other aspects of the system require allocation of a larger piece than requested. An example of this is the physical page pool, a fixed-size item pool, where only whole pages can be allocated. On average this means wasting half of the last page (internal fragmentation) required to support a given section of a program, or any other allocated segment of memory that is page-based.
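The page pool example reduces to the usual round-up arithmetic, as the short sketch below shows (the 2K page size and the request size are arbitrary illustrative choices):

    #include <stdio.h>

    #define PAGE_SIZE 2048  /* assumed fixed item size in the page pool */

    int main(void)
    {
        size_t request = 5000;  /* hypothetical segment size in bytes    */
        size_t pages   = (request + PAGE_SIZE - 1) / PAGE_SIZE; /* round up */
        size_t wasted  = pages * PAGE_SIZE - request;  /* internal fragment */
        printf("%zu pages allocated, %zu bytes wasted\n", pages, wasted);
        return 0;
    }

A 5,000-byte request consumes three 2,048-byte pages (6,144 bytes), leaving 1,144 bytes of internal fragmentation in the last page.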
4. Assuming a page size of 2K and a 32-bit machine, how many bits are used for the page number, and how many are used for the offset within the page? Give two reasons why page sizes are usually powers of two.
o 21 bits for the page number, 11 bits for the offset within the page.
o Three reasons for page sizes being powers of 2 (you needed two):
1. It simplifies the design of the hardware because translating a logical address into a page number and page offset requires only masking off sections of the address represented as a binary number. The hardware can mask and shift rather than having to add, subtract, and divide (see the sketch after this list).
2. It makes the hardware both cheaper and faster.
3. Page sizes should be multiples (usually 1) of the disk block size, and disk block sizes are usually powers of 2 as well.
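For the first reason above, the mask-and-shift translation for this question's 2K-page, 32-bit configuration looks like the following minimal sketch (the sample address is arbitrary):

    #include <stdio.h>
    #include <stdint.h>

    #define OFFSET_BITS 11                         /* 2K page = 2^11 bytes */
    #define OFFSET_MASK ((1u << OFFSET_BITS) - 1)  /* low 11 bits: 0x7FF   */

    int main(void)
    {
        uint32_t addr   = 0x00012ABCu;             /* arbitrary 32-bit address  */
        uint32_t page   = addr >> OFFSET_BITS;     /* high 21 bits: page number */
        uint32_t offset = addr & OFFSET_MASK;      /* low 11 bits: page offset  */
        printf("page = %u, offset = %u\n", page, offset);
        return 0;
    }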
5. Describe a mechanism by which one or more pages could belong to the address space of two different processes.
o Using page-based memory allocation and program segmentation gives us the possibility of sharing common code. To be shareable, the code must be read-only, so that it never changes during execution. Thus, two or more processes can execute the same code at the same time. The logical pages holding the code in each program image will be mapped by the page tables of both processes to the same set of physical pages. The same physical pages will thus be used by two different logical address spaces, and thus two different processes. The data pages of the two processes will be different, of course, since the processes must be free to behave independently. For example, two processes running the same editor use the same code pages, but have different data pages since they can be editing different files. The data pages within each logical address space will thus be mapped to different physical pages. The stack pages of two processes sharing code are also different, since the processes are independent threads of control and thus must have independent execution contexts. A given set of data pages can be shared among two or more processes as well, but then all sharing processes can change the data, and all the familiar issues of concurrency control apply.
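One common user-visible realization of this on POSIX systems is a shared mapping: after fork(), parent and child have distinct logical address spaces whose page tables point at the same physical pages for the mapped region. A minimal Linux/BSD sketch (it shares data pages rather than program text, but the page table mechanism is the same):

    #include <stdio.h>
    #include <string.h>
    #include <sys/mman.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int main(void)
    {
        /* One page, mapped shared, so both processes see the same frame. */
        char *buf = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
                         MAP_SHARED | MAP_ANONYMOUS, -1, 0);
        if (buf == MAP_FAILED)
            return 1;

        if (fork() == 0) {                      /* child writes ...  */
            strcpy(buf, "written by the child");
            _exit(0);
        }
        wait(NULL);                             /* ... parent reads  */
        printf("parent sees: %s\n", buf);
        munmap(buf, 4096);
        return 0;
    }

Had the mapping been MAP_PRIVATE instead, each process would receive its own copy of the page on write, and the parent would see nothing.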
Question 2 (Chap 8; 50 Points)
Consider a system with memory mapping done on a page basis, and using a single level page table. Assume that the necessary page table is always in memory.
- (15) If a memory reference takes 100 nanoseconds, how long does a memory mapped reference take if there is no TLB within the MMU to cache recently used page addresses?
- 200 nanoseconds: 100 to get the page table entry, and 100 to access the memory location.
- (20) Now we add an MMU which imposes an overhead of 15 nanoseconds on a hit or a miss. If we assume that 90 percent of all memory references hit in the MMU TLB, what is the Effective Memory Access time?
- There are two cases. In the first case, the TLB contains the required entry, and we pay the 15 ns overhead on top of the 100 ns memory access time, for 115 ns. In the second case, the TLB does not contain the entry, and we pay an additional 100 ns to get the required entry into the TLB, making 215 ns altogether. Hence, the EMAT is: 0.90 × 115 ns + 0.10 × 215 ns = 103.5 ns + 21.5 ns = 125 ns.
- (15) Explain how the TLB hit rate affects the EMAT.
- The higher the TLB hit rate is, the smaller the EMAT is, because the additional 100 ns penalty to get the page table entry into the TLB on a miss contributes less to the EMAT. It is important to note, however, that while any increase in the TLB hit rate decreases the EMAT, the magnitude of the change is not constant. The EMAT asymptotically approaches the TLB-hit access time (here 115 ns: the 100 ns memory access plus the 15 ns TLB overhead) as the TLB hit rate approaches 100%. Thus, at some point it makes more sense to stop spending money to increase the TLB hit rate, and to spend it instead on faster physical memory.
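The asymptotic behavior is easy to see by evaluating the EMAT over a range of hit rates, as in this small sketch using the timing assumptions of this question (100 ns memory access, 15 ns TLB overhead, one extra memory access on a miss):

    #include <stdio.h>

    /* Effective memory access time as a function of the TLB hit rate. */
    double emat(double hit_rate)
    {
        return hit_rate * (15.0 + 100.0) +                 /* hit:  115 ns */
               (1.0 - hit_rate) * (15.0 + 100.0 + 100.0);  /* miss: 215 ns */
    }

    int main(void)
    {
        for (double h = 0.80; h <= 1.0001; h += 0.05)
            printf("hit rate %.2f -> EMAT %.1f ns\n", h, emat(h));
        return 0;
    }

At a hit rate of 0.90 this prints the 125 ns computed above; at 1.00 it bottoms out at 115 ns, the memory access time plus the unavoidable TLB overhead.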
Question 3 (Chap 9; 40 Points)
Several short questions relating to the basic ideas of virtual memory management and use.
1. Under what circumstances do page faults occur? Describe the actions taken by the operating system when a page fault occurs.
o Page faults happen when a process tries to use a valid logical page that is not presently mapped into physical memory. The reference causes a trap to the operating system. The actions taken by the OS (mimicked at user level in the sketch after this list) are:
1. Save the state of the interrupted process when the page fault occurs. Reset the hardware pipelines, and transfer control to the trap (page fault) handler.
2. Check the page table for this process to determine whether the reference was a valid or invalid logical address reference. If it is invalid, then terminate the process (page fault - core dumped). Otherwise arrange to read the contents of the referenced logical page into a designated physical page.
3. Schedule a disk operation to read the desired page into memory. Take the current process from the running state to the blocked state.
4. When the disk read is complete, the system modifies the page table to indicate that the page is now in memory. Change the process' state from blocked to ready.
5. Restart the process as though the page had always been in memory, by restarting the instruction which caused the trap (page fault).
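Real page fault handling lives in the kernel, but the trap-fix-restart cycle in the steps above can be mimicked at user level. In this hedged Linux sketch, a page is mapped with no access permissions, the SIGSEGV handler plays the role of the page fault handler by granting access (the analogue of loading the page), and returning from the handler restarts the faulting instruction:

    #include <signal.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/mman.h>
    #include <unistd.h>

    static char *page;

    /* "Page fault handler": make the page accessible and return,
     * which restarts the faulting instruction, as in step 5. */
    static void handler(int sig, siginfo_t *info, void *ctx)
    {
        (void)sig; (void)ctx;
        if (info->si_addr == (void *)page)
            mprotect(page, 4096, PROT_READ | PROT_WRITE);
        else
            _exit(1);                        /* invalid reference: terminate */
    }

    int main(void)
    {
        struct sigaction sa;
        memset(&sa, 0, sizeof sa);
        sa.sa_flags = SA_SIGINFO;
        sa.sa_sigaction = handler;
        sigaction(SIGSEGV, &sa, NULL);

        page = mmap(NULL, 4096, PROT_NONE,   /* mapped but inaccessible */
                    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        page[0] = 'x';                       /* faults, handler fixes, restarts */
        printf("after the fault: %c\n", page[0]);
        return 0;
    }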
2. Assume that you have a page-reference string for a process with M frames (initially all empty). The page-reference string has length P; N distinct page numbers occur in it. Answer these questions for any page replacement algorithm:
o What is a lower bound on the number of page faults?
§ Lower Bound: N. The reference string indicates that the program actually references N distinct pages, and the best any page replacement algorithm can do is fault exactly once per distinct page, on its first reference, which no replacement policy can avoid.
o What is an upper bound on the number of page faults?
§ Upper Bound: P. The worst possible situation is one where the frame allocation is so small, and the page replacement algorithm so inept, that every reference in the string is to a page that is no longer resident, so every reference generates a page fault.
3. What is the cause of thrashing?
o A process is thrashing if it is spending "too much" of its time paging rather than executing the program. How much is too much is clearly a debatable point, but it is also clear that the orders-of-magnitude slowdown of a disk access on a page fault can slow a process drastically if it happens at all frequently. When the degree of multiprogramming increases, CPU utilization also generally increases. However, if the degree of multiprogramming increases beyond the maximum the system can support, there will not be enough frames to hold each process's working set, which increases the probability that any given memory reference will cause a page fault. This means that thrashing will begin, and CPU utilization will drop sharply. The cause of thrashing is thus that one or more processes have too few frames to hold their working sets, raising the page fault frequency to a wasteful level.
Question 4 (Chap 9; 30 Points)
Consider a two-dimensional array A:

    int A[][] = new int[100][100];

Assume for the sake of argument that A[0][0] is stored at location 200 in a paged memory system with a page size of 200. The text segment of a small process for manipulating the matrix A resides in page 0 (locations 0-199), and every instruction fetch will thus be from page 0.
Assuming a working set of 3 page frames, how many page faults will be generated by the following array initialization loops, using LRU replacement? Assume page frame 1 has the program text in it, and the other two are used for data access to the matrix. Ignore the page or pages supporting the stack, and thus containing the variables i and j, for the purposes of this exercise.
a. for (int j = 0; j < 100; j++) {
       for (int i = 0; i < 100; i++) {
           A[i][j] = 0;
       }
   }

b. for (int i = 0; i < 100; i++) {
       for (int j = 0; j < 100; j++) {
           A[i][j] = 0;
       }
   }

Make sure to read Section 9.9.5, and note that the book's explanation assumes that the contents of the 2D matrix A are in row-major order, which means that rows are stored in contiguous main memory locations. Assume that each integer in the matrix occupies a single memory location.
1. The program text will never be the least recently used page, since program instructions are run between any two assignments to the array. Therefore, we can ignore the program and consider only the 2 physical pages for the data. The book assumes that matrices are stored in row-major order. In initialization loop (a), the row index (i) is incremented in the inner loop, while the column index (j) is incremented in the outer loop. As a result, each pass around the inner loop moves 100 locations forward, half the page size, to the same column position within the next row, so a new page is referenced every two times around the inner loop. The number of rows is large enough that no data page will still be one of the two in memory when it is touched again, so every second array reference generates a page fault. Since the array A has 10,000 elements, this leads to a total of 5,000 page faults. Add one page fault for the initial instruction fetch from the text frame if you wish to be obsessively complete, and loop (a) gives 5,001 page faults.
2. In initialization loop (b), the page holding the program will never be paged out, for the same reason as in (a), so once again we can assume 2 physical pages for the matrix data. In this case, the inner loop increments j and thus visits every member of a row before visiting the next row. Thus, each page is entirely initialized and then never touched again, so there is exactly one page fault per page holding the array. Since the array has 10,000 elements and the page size is 200, there will be 10,000 / 200 = 50 page faults. Again, if you wish to be fanatical, add one for the initial text page reference, and loop (b) produces 51 page faults.
3. If you really want to consider the stack page, which holds the loop variables i and j, then simply add a page frame for the stack to the set. The loop variables on the stack are accessed frequently, so the page frame supporting the stack will also never be the LRU page; the page fault behavior is therefore exactly the same, except that we should add one to the number of page faults produced by each loop to account for the initial reference to the stack page. A simulation checking both fault counts is sketched below.
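Both counts can be checked mechanically. The following is a hedged simulation sketch that replays each loop's data-page reference string against two LRU-managed frames, using the layout assumptions above (A[i][j] at location 200 + 100*i + j, page size 200):

    #include <stdio.h>

    static int frames[2] = { -1, -1 };  /* resident data pages    */
    static int lru = 0;                 /* index of the LRU frame */
    static int faults = 0;

    /* Reference a page: on a hit the other frame becomes LRU;
     * on a miss, replace the LRU frame and count a fault. */
    static void touch(int pg)
    {
        if (frames[0] == pg) { lru = 1; return; }
        if (frames[1] == pg) { lru = 0; return; }
        frames[lru] = pg;
        lru = 1 - lru;
        faults++;
    }

    static void reset(void) { frames[0] = frames[1] = -1; lru = 0; faults = 0; }

    int main(void)
    {
        reset();                                 /* loop (a): i inner */
        for (int j = 0; j < 100; j++)
            for (int i = 0; i < 100; i++)
                touch((200 + 100 * i + j) / 200);
        printf("loop (a): %d data page faults\n", faults);

        reset();                                 /* loop (b): j inner */
        for (int i = 0; i < 100; i++)
            for (int j = 0; j < 100; j++)
                touch((200 + 100 * i + j) / 200);
        printf("loop (b): %d data page faults\n", faults);
        return 0;
    }

It prints 5000 faults for loop (a) and 50 for loop (b), matching the analysis; the extra +1 for the initial text page reference lies outside the simulation.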
Question 5 (Chap 10; 30 Points)
Consider a demand-paged computer system where the degree of multiprogramming is currently fixed at four. The system was recently measured to determine the utilization of the CPU and the paging disk. The results are one of the following alternatives. For each case, what is happening? Can you increase the degree of multiprogramming to increase the CPU utilization? Is the paging helping to improve performance?
1. CPU utilization, 13 percent; disk utilization, 97 percent. The CPU utilization is low, yet the disk utilization is very high. This indicates that either all the processes are primarily I/O-bound jobs, or the system is thrashing. Increasing the degree of multiprogramming will only make thrashing worse, since the paging is not helping to improve performance. However, increasing multiprogramming with some CPU-bound jobs might help if the high disk utilization is caused by I/O-bound jobs. Of course, it is hard to know whether a process is CPU-bound until its behavior demonstrates the fact. This illustrates the important point that in this situation we would have to know why the disk utilization was high: application-level I/O or page fault I/O.
2. CPU utilization, 87 percent; disk utilization, 3 percent. The CPU utilization is high, and the disk utilization is low. This indicates that the system is operating normally. The degree of multiprogramming is about right. It might be possible to increase it slightly, but that also runs the risk of increasing it too much, thereby engendering a thrashing situation. Paging is helping to improve performance in this case.
3. CPU utilization, 13 percent; disk utilization, 3 percent. Both CPU and disk utilization are low. The few processes running on the system are receiving all the service they want. However, the low CPU utilization indicates that the degree of multiprogramming could be increased to increase the CPU utilization. Paging is not helping system performance, but it is not hurting it either, for the simple reason that there is almost no paging going on. Simply stated, the system does not have enough to do!