Driver porting: supporting mmap()
Date: 2006-05-13 Source: rwen2012
[Posted April 14, 2003 by corbet]
This article is part of the LWN Porting Drivers to 2.6 series.
Using remap_page_range()
There are two techniques in use for implementing mmap(); often the simpler of the two is using remap_page_range(). This function creates a set of page table entries covering a given physical address range. The prototype of remap_page_range() changed slightly in 2.5.3; the relevant virtual memory area (VMA) pointer must be passed as the first parameter:
int remap_page_range(struct vm_area_struct *vma, unsigned long from,
unsigned long to, unsigned long size,
pgprot_t prot);
remap_page_range() is now explicitly documented as requiring that the memory management semaphore (usually current->mm->mmap_sem) be held when the function is called. Drivers will almost invariably call remap_page_range() from their mmap() method, where that semaphore is already held, so driver writers do not normally need to worry about acquiring mmap_sem themselves. If you use remap_page_range() from somewhere other than your mmap() method, however, do be sure you have acquired the semaphore first.
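As a rough illustration of the usual call pattern, the sketch below shows the core of an mmap() method built on remap_page_range(). It is written so that it compiles in user space: the structure definitions are cut-down stand-ins for the kernel's, remap_page_range() is a mock that only records its arguments, and the SKETCH_* names and the sketch_mmap() function are hypothetical. The bounds check is an assumption about what a careful driver would want, not something the kernel requires.

```c
#include <errno.h>
#include <stddef.h>

/* User-space stand-ins for kernel definitions, so the sketch compiles
 * outside a kernel tree. A real 2.6 driver gets these from <linux/mm.h>. */
#define PAGE_SHIFT 12
typedef struct { unsigned long pgprot; } pgprot_t;
struct vm_area_struct {
    unsigned long vm_start;   /* first user virtual address of the mapping */
    unsigned long vm_end;     /* first address past the end of the mapping */
    unsigned long vm_pgoff;   /* offset into the device, in pages */
    pgprot_t vm_page_prot;    /* protection bits for the new page table entries */
};

/* Hypothetical device window; a real driver would learn these from its bus. */
#define SKETCH_PHYS_BASE 0xd0000000UL
#define SKETCH_PHYS_SIZE 0x00100000UL   /* 1 MB */

/* Mock of the kernel function: records its arguments and reports success. */
struct { unsigned long from, to, size; int calls; } last_remap;

int remap_page_range(struct vm_area_struct *vma, unsigned long from,
                     unsigned long to, unsigned long size, pgprot_t prot)
{
    (void)vma;
    (void)prot;
    last_remap.from = from;
    last_remap.to = to;
    last_remap.size = size;
    last_remap.calls++;
    return 0;
}

/* Core of a driver's mmap() method (the struct file argument is omitted). */
int sketch_mmap(struct vm_area_struct *vma)
{
    unsigned long size = vma->vm_end - vma->vm_start;
    unsigned long offset = vma->vm_pgoff << PAGE_SHIFT;

    /* Refuse requests that reach past the end of the device window. */
    if (offset > SKETCH_PHYS_SIZE || size > SKETCH_PHYS_SIZE - offset)
        return -EINVAL;

    /* When called from the mmap() method, mmap_sem is already held. */
    if (remap_page_range(vma, vma->vm_start, SKETCH_PHYS_BASE + offset,
                         size, vma->vm_page_prot))
        return -EAGAIN;
    return 0;
}
```

In a real driver the method would also take the struct file pointer and would be installed in the driver's file_operations structure.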
Note that, if you are remapping into I/O space, you may want to use:
int io_remap_page_range(struct vm_area_struct *vma, unsigned long from,
unsigned long to, unsigned long size,
pgprot_t prot);
On all architectures other than SPARC, io_remap_page_range() is just another name for remap_page_range(). On SPARC systems, however, io_remap_page_range() uses the system's I/O mapping hardware to provide access to I/O memory.
remap_page_range() retains its longstanding limitation: it cannot be used to remap most system RAM. Thus, it works well for I/O memory areas, but not for internal buffers. For that case, it is necessary to define a nopage() method. (Yes, if you are curious, the "mark pages reserved" hack still works as a way of getting around this limitation, but its use is strongly discouraged.)
Using vm_operations
The other way of implementing mmap() is to override the default VMA operations to set up a driver-specific nopage() method. That method will be called to deal with page faults in the mapped area; it is expected to return a struct page pointer to satisfy the fault. The nopage() approach is flexible, but it cannot be used to remap I/O regions; only memory represented in the system memory map can be mapped in this way.
The nopage() method made it through the entire 2.5 development series without changes, only to be modified in the 2.6.1 release. The prototype for that function used to be:
struct page *(*nopage)(struct vm_area_struct *area,
unsigned long address,
int unused);
As of 2.6.1, the unused argument is no longer unused, and the prototype has changed to:
struct page *(*nopage)(struct vm_area_struct *area,
unsigned long address,
int *type);
The type argument is now used to return the type of the page fault; VM_FAULT_MINOR would indicate a minor fault, one where the page was in memory and all that was needed was a page table fixup. A return of VM_FAULT_MAJOR would, instead, indicate that the page had to be fetched from disk. Driver code using nopage() to implement a device mapping would probably return VM_FAULT_MINOR. In-tree code checks whether type is NULL before assigning the fault type; other users would be well advised to do the same.
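The NULL check described above might look something like the following sketch of a 2.6.1-style nopage() method. As before, it is arranged to compile in user space: struct page and struct vm_area_struct are minimal stand-ins, get_page() is a mock that only bumps a counter, the constant values are copied in as assumptions about the 2.6 headers, and the sketch_pages array and sketch_nopage() are hypothetical names standing in for a driver's own buffer and method.

```c
#include <stddef.h>

/* User-space stand-ins for kernel definitions; a real driver would get
 * these from <linux/mm.h>. The numeric values are assumptions. */
#define PAGE_SHIFT 12
#define VM_FAULT_MINOR 1
#define NOPAGE_SIGBUS ((struct page *)NULL)

struct page { int count; };
struct vm_area_struct {
    unsigned long vm_start;   /* first user virtual address of the mapping */
    unsigned long vm_end;     /* first address past the end of the mapping */
    unsigned long vm_pgoff;   /* offset into the buffer, in pages */
};

/* Hypothetical driver buffer, described by one struct page per page. */
#define SKETCH_NPAGES 4
struct page sketch_pages[SKETCH_NPAGES];

/* Mock of get_page(): takes a reference on the page. */
void get_page(struct page *page)
{
    page->count++;
}

struct page *sketch_nopage(struct vm_area_struct *area,
                           unsigned long address, int *type)
{
    /* Page index within the buffer for the faulting address. */
    unsigned long index = ((address - area->vm_start) >> PAGE_SHIFT)
                          + area->vm_pgoff;

    if (index >= SKETCH_NPAGES)
        return NOPAGE_SIGBUS;       /* fault lies beyond the buffer */

    get_page(&sketch_pages[index]); /* the caller expects a reference */
    if (type)                       /* type may be NULL; check first */
        *type = VM_FAULT_MINOR;
    return &sketch_pages[index];
}
```

The page was already in memory, so the sketch reports a minor fault whenever the caller asks for the fault type.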
There are a couple of other things worth mentioning. One is that the vm_operations_struct is rather smaller than it was in 2.4.0; the protect(), swapout(), sync(), unmap(), and wppage() methods have all gone away (they were actually deleted in 2.4.2). Device drivers made little use of these methods, and should not be affected by their removal.
There is also one new vm_operations_struct method:
int (*populate)(struct vm_area_struct *area, unsigned long address,
unsigned long len, pgprot_t prot, unsigned long pgoff,
int nonblock);
The populate() method was added in 2.5.46; its purpose is to "prefault" pages within a VMA. A device driver could certainly implement this method by simply invoking its nopage() method for each page within the given range, then using:
int install_page(struct mm_struct *mm, struct vm_area_struct *vma,
unsigned long addr, struct page *page,
pgprot_t prot);
to create the page table entries. In practice, however, there is no real advantage to doing things in this way. No driver in the mainline (2.5.67) kernel tree implements the populate() method.
Finally, one use of nopage() is to allow a user process to map a kernel buffer which was created with vmalloc(). In the past, a driver had to walk through the page tables to find a struct page corresponding to a vmalloc() address. As of 2.5.5 (and 2.4.19), however, all that is needed is a call to:
struct page *vmalloc_to_page(void *address);
This call is not a variant of vmalloc() - it allocates no memory. It simply returns a pointer to the struct page associated with an address obtained from vmalloc().
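To make the lookup concrete, here is a small sketch of the address arithmetic a nopage() method might perform before calling vmalloc_to_page(). It compiles in user space only because everything kernel-side is faked: a static array stands in for the vmalloc() buffer, vmalloc_to_page() is a mock that indexes a parallel array of struct page, and sketch_buf, sketch_map, and sketch_buf_page() are hypothetical names.

```c
#include <stddef.h>

/* User-space stand-ins for kernel definitions. */
#define PAGE_SHIFT 12
#define PAGE_SIZE (1UL << PAGE_SHIFT)

struct page { int count; };
struct vm_area_struct {
    unsigned long vm_start;   /* first user virtual address of the mapping */
    unsigned long vm_end;     /* first address past the end of the mapping */
    unsigned long vm_pgoff;   /* offset into the buffer, in pages */
};

/* Stand-in for a buffer obtained from vmalloc(), plus its page structures. */
#define SKETCH_BUF_PAGES 4
char sketch_buf[SKETCH_BUF_PAGES * PAGE_SIZE];
struct page sketch_map[SKETCH_BUF_PAGES];

/* Mock of vmalloc_to_page(): maps a buffer address to its struct page. */
struct page *vmalloc_to_page(void *address)
{
    size_t off = (size_t)((char *)address - sketch_buf);

    return &sketch_map[off >> PAGE_SHIFT];
}

/* Find the page backing a faulting user address, or NULL if out of range. */
struct page *sketch_buf_page(struct vm_area_struct *area,
                             unsigned long address)
{
    /* Byte offset into the buffer: position within the VMA plus the
     * offset the process passed to mmap(). */
    unsigned long offset = (address - area->vm_start)
                           + (area->vm_pgoff << PAGE_SHIFT);

    if (offset >= sizeof(sketch_buf))
        return NULL;
    return vmalloc_to_page(sketch_buf + offset);
}
```

A nopage() method built on this helper would still need to take a reference on the page before returning it.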
Driver porting: supporting mmap()
I tested this on another machine and it succeeded. It seems the 2.6.0 kernel didn't work normally on the original machine.
But I have run into another problem now.
I reserve 4M of memory at boot time. My driver supports the mmap method by calling remap_page_range() with (TOTAL_MEMORY_SIZE - 4M) as the physical start address parameter.
In the driver I access the memory block through the pointer returned by ioremap(TOTAL_MEMORY_SIZE - 4M, 4M). That seems to work. In the application I write some data through the pointer returned by mmap, then dump the memory block in the driver, and the output isn't the same as what I wrote on the application side. What's wrong with the driver?
BTW, my driver supports the read/write methods, too. In them I use copy_to_user/copy_from_user. After a write from the application, the driver dumps the memory block and the output is correct.
How can I support mmap correctly?