
Zero-copy user-space access

Date: 2006-05-13  Source: rwen2012


[Posted April 14, 2003 by corbet]
This article is part of the LWN Porting Drivers to 2.6 series.
The kiobuf abstraction was introduced in 2.3 as a low-level way of representing I/O buffers. Its primary use, perhaps, was to represent zero-copy I/O operations going directly to or from user space. A number of problems were found with the kiobuf interface, however; among other things, it forced large I/O operations to be broken down into small chunks, and it was seen as a heavyweight data structure. So, in 2.5.43, kiobufs were removed from the kernel.

This article looks at how to port drivers which used the kiobuf interface in 2.4. We'll proceed on the assumption that the real feature of interest was direct access to user space; there wasn't much motivation to use a kiobuf otherwise.

Zero-copy block I/O

The 2.6 kernel has a well-developed direct I/O capability for block devices. So, in general, it will not be necessary for block driver writers to do anything to implement direct I/O themselves. It all "just works."

Should you have a need to perform zero-copy block operations, it's worth noting the presence of a useful helper function:

 struct bio *bio_map_user(struct block_device *bdev,
                          unsigned long uaddr,
                          unsigned int len,
                          int write_to_vm);

This function returns a BIO describing a direct operation to the given block device bdev. The parameters uaddr and len describe the user-space buffer to be transferred. Callers must check the returned BIO, however: it can be NULL, and the area actually mapped might be smaller than what was requested. The write_to_vm flag should be set if the operation will change memory, that is, if it is a read-from-disk operation. The returned BIO is ready for submission to the appropriate device driver.

When the operation is complete, undo the mapping with:

 void bio_unmap_user(struct bio *bio, int write_to_vm);

Mapping user-space pages

If you have a char driver which needs direct user-space access (a high-performance streaming tape driver, say), then you'll want to map user-space pages yourself. The modern equivalent of map_user_kiobuf() is a function called get_user_pages():

 int get_user_pages(struct task_struct *task,
                    struct mm_struct *mm,
                    unsigned long start,
                    int len,
                    int write,
                    int force,
                    struct page **pages,
                    struct vm_area_struct **vmas);

task is the process performing the mapping; the primary purpose of this argument is to say who gets charged for page faults incurred while mapping the pages. This parameter is almost always passed as current. The memory management structure for the user's address space is passed in the mm parameter; it is usually current->mm. Note that get_user_pages() expects that the caller will have a read lock on mm->mmap_sem. The start and len parameters describe the user buffer to be mapped; len is in pages, not bytes. If the memory will be written to, write should be non-zero. The force flag forces read or write access, even if the current page protection would otherwise not allow that access. The pages array (which should be big enough to hold len entries) will be filled with pointers to the page structures for the user pages. If vmas is non-NULL, it will be filled with pointers to the vm_area_struct structures containing each page.
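Since len counts pages, a driver that is handed a byte address and a byte count has to work out how many pages the buffer touches before sizing the pages array. The following is a minimal sketch of that arithmetic, not code taken from the kernel: pages_spanned() is a hypothetical helper name, and the page-size macros are defined locally, assuming 4 KiB pages, only so that the calculation can be tried in user space. In a driver they come from the kernel headers.

```c
/* Assumption: 4 KiB pages. In a driver, PAGE_SHIFT, PAGE_SIZE and
 * PAGE_MASK are provided by the kernel headers; they are defined here
 * only so the arithmetic can be compiled outside the kernel. */
#define PAGE_SHIFT 12
#define PAGE_SIZE  (1UL << PAGE_SHIFT)
#define PAGE_MASK  (~(PAGE_SIZE - 1))

/* Hypothetical helper: number of pages touched by the byte range
 * [uaddr, uaddr + count). */
static unsigned long pages_spanned(unsigned long uaddr, unsigned long count)
{
    /* Offset of the first byte within its page. */
    unsigned long offset = uaddr & ~PAGE_MASK;

    if (count == 0)
        return 0;

    /* Round the end of the buffer up to the next page boundary. */
    return (offset + count + PAGE_SIZE - 1) >> PAGE_SHIFT;
}
```

Under the 4 KiB assumption, an 8-byte buffer starting at 0x1ffc straddles a page boundary, so pages_spanned(0x1ffc, 8) gives 2 and the pages array needs room for two entries.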

The return value is the number of pages actually mapped, or a negative error code if something goes wrong. Assuming things worked, the user pages will be present (and locked) in memory, and can be accessed by way of the struct page pointers. Be aware, of course, that some or all of the pages could be in high memory.

There is no equivalent put_user_pages() function, so callers of get_user_pages() must perform the cleanup themselves. There are two things that need to be done: marking of modified pages, and releasing them from the page cache. If your device modified the user pages, the virtual memory subsystem may not know about it, and may fail to write the pages to permanent storage (or swap). That, of course, could lead to data corruption and grumpy users. The way to avoid this problem is to call:

 SetPageDirty(struct page *page);

for each page in the mapping. Current (2.6.3) kernel code first checks that the page is not reserved, with code like:

 if (!PageReserved(page))
         SetPageDirty(page);

But pages mapped from user space should not, normally, be marked reserved in the first place.

Finally, every mapped page must be released from the page cache, or it will stay there forever; simply pass each page structure to:

 void page_cache_release(struct page *page);

After you have released the page, of course, you should not access it again.
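The two cleanup steps fit naturally into a single loop over the pages array. The sketch below is an illustration under stated assumptions rather than kernel code: release_user_pages() is a hypothetical helper name, and struct page together with PageReserved(), SetPageDirty() and page_cache_release() are simplified userspace stand-ins, so that the loop can be compiled and exercised outside the kernel. In a driver, the real definitions come from the kernel headers and the stand-ins would be dropped.

```c
/* Userspace stand-ins, for illustration only. The real struct page and
 * the three page operations below are provided by the kernel. */
struct page {
    int reserved;   /* stand-in for the "reserved" page flag */
    int dirty;      /* stand-in for the "dirty" page flag */
    int count;      /* stand-in for the page reference count */
};

static int  PageReserved(struct page *page)       { return page->reserved; }
static void SetPageDirty(struct page *page)       { page->dirty = 1; }
static void page_cache_release(struct page *page) { page->count--; }

/* Hypothetical helper: undo a get_user_pages() mapping of nr_pages pages.
 * Pass a non-zero dirtied value if the device wrote into the pages. */
static void release_user_pages(struct page **pages, int nr_pages, int dirtied)
{
    int i;

    for (i = 0; i < nr_pages; i++) {
        /* Tell the VM the contents changed, skipping reserved pages. */
        if (dirtied && !PageReserved(pages[i]))
            SetPageDirty(pages[i]);

        /* Drop the reference; pages[i] must not be used afterwards. */
        page_cache_release(pages[i]);
    }
}
```

A driver would call such a helper once the I/O has completed, passing the count returned by get_user_pages() rather than the count it asked for, since fewer pages may have been mapped.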

For a good example of how to use get_user_pages() in a char driver, see the definition of sgl_map_user_pages() in drivers/scsi/st.c.

Driver porting: Zero-copy user-space access
Posted Feb 13, 2004 14:34 UTC (Fri) by guest grisu1976 [Link]

I don't really understand why the kiobuf interface does not exist anymore. In Linux kernel 2.4 the kiobuf interface used get_user_pages, or am I wrong? The kiobuf interface was easier to use than get_user_pages; that's my opinion.

Driver porting: Zero-copy user-space access
Posted Mar 3, 2004 7:55 UTC (Wed) by subscriber bhepple [Link]

Hmmm, a quick recursive grep through the 2.6.3 driver source and include files showed exactly 0 users of set_page_dirty_lock() and 1 user of put_page() (in drivers/char/agp/generic.c)

There _is_ a
#define page_cache_release(page) put_page(page)
in include/linux/pagemap.h and it is quite a popular little chap in the device driver code with 13 hits in the entire tree.

Am I missing something or should we be using page_cache_release instead of put_page and is it (and set_page_dirty_lock) _really_ needed after all - I can hardly believe all those drivers are causing "data corruption and grumpy users"...

Driver porting: Zero-copy user-space access
Posted Nov 2, 2005 21:29 UTC (Wed) by subscriber rwbowman [Link]

TRUE or FALSE - if I map and lock a user space buffer using get_user_pages during ioctl, and then allow the ioctl to return, the pages will remain locked until I release them with page_cache_release.

I've heard (twice now) folks say they don't remain locked?
Thanks!

Driver porting: Zero-copy user-space access
Posted Feb 15, 2006 17:17 UTC (Wed) by guest ceb [Link]

Does anyone have any experience in using get_user_pages in a real system? As far as I can tell it is totally unreliable when the machine is in any way loaded. As well as getting zero addresses for pages returned, the physical addresses don't seem to correspond to the data to be transferred. This is on various flavors of Linux 2.6 based kernels.

This seems to rule out performing DMA directly from user space but I would like to be told that I'm wrong.