
Are "skb->data" physically continuous?

Posted: 2007-01-26  Source: rwen2012

Re: Are "skb->data" physically continuous?

From: Nick Patavalis ([email protected])
Date: Mon Sep 15 2003 - 05:27:03 EST
  • In reply to: Jamie Lokier: "Re: Are "skb->data" physically continuous?"
On Sun, Sep 14, 2003 at 02:40:54PM +0100, Jamie Lokier wrote:
> Shmulik Hen wrote:
> > On Sunday 14 September 2003 01:41 am, Nick Patavalis wrote:
> > > this assumption, but I have also heard that "zero-copy" networking
> > > was added to the kernel at some point. Zero-copy indicates that
> > > data come directly for user-space and, hence, they might be
> > > non-continuous.
> >
> > You may want to take a look at e100_main.c in one of the latest 2.4.x
> > kernels. There you should be able to see how to deal with
> > dev->features and the flags NETIF_F_SG for scatter-gather
> > capabilities and NETIF_F_*_CSUM for checksum offloading capabilities.
> > Zero-copy was added in 2.4.4, and is a combination of the above. Also,
> > take a look at skbuff.h for MAX_SKB_FRAGS and struct skb_shared_info
> > and their use in the kernel code.
>
> In case it wasn't clear, if you _don't_ set those NETIF_* flags, then
> your driver is always passed contiguous data.
> performance penalty.
>

Hen and Jamie,

Thanks a lot for your very helpful replies. I took a look at the
places you suggested in order to find out how a driver supporting
scatter-gather should be coded. In the hope that others might find
this useful, I'm sending a rather longish description of what I found
out. I hope that there are not too many misconceptions, or that
someone will point them out if there are.

** Features of a Networking Driver / Device.

The "net_device" structure (defined in "include/linux/netdevice.h"),
which is filled in by a net driver at initialization time, includes a
field called "features". By setting certain bits in this field the
driver can inform the networking stack of its capabilities. As of
2.4.20 the following feature masks are defined (in
"include/linux/netdevice.h"), and can be declared by the driver:

NETIF_F_SG
Scatter/gather IO.

NETIF_F_IP_CSUM
Can checksum only TCP/UDP over IPv4.

NETIF_F_NO_CSUM
Does not require checksum. E.g. loopback.

NETIF_F_HW_CSUM
Can checksum all the packets.

NETIF_F_DYNALLOC
Self-destructible device.

NETIF_F_HIGHDMA
Can DMA to high memory.

NETIF_F_FRAGLIST <------------------- ??? WHAT IS THIS ???
Scatter/gather IO.

NETIF_F_HW_VLAN_TX
Transmit VLAN hw acceleration

NETIF_F_HW_VLAN_RX
Receive VLAN hw acceleration

NETIF_F_HW_VLAN_FILTER
Receive filtering on VLAN

NETIF_F_VLAN_CHALLENGED
Device cannot handle VLAN packets
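
To make the idea of "declaring" capabilities a bit more concrete, here is a small user-space sketch of the pattern. It is not kernel code: "struct mock_net_device", the "MOCK_F_*" names and their numeric values are stand-ins invented for this example. A real driver would OR the "NETIF_F_*" masks from "include/linux/netdevice.h" into "dev->features" at initialization time.

```c
#include <assert.h>

/* Stand-in feature bits. Names and values are illustrative only;
 * the real masks live in include/linux/netdevice.h. */
enum {
    MOCK_F_SG      = 1 << 0,  /* scatter/gather IO */
    MOCK_F_IP_CSUM = 1 << 1,  /* can checksum TCP/UDP over IPv4 */
    MOCK_F_HIGHDMA = 1 << 2   /* can DMA to high memory */
};

/* Minimal stand-in for the "features" field of struct net_device. */
struct mock_net_device {
    unsigned int features;
};

/* What a driver's init routine might do: declare its capabilities. */
static void mock_driver_init(struct mock_net_device *dev, int hw_can_sg)
{
    dev->features = 0;
    if (hw_can_sg)
        dev->features |= MOCK_F_SG | MOCK_F_IP_CSUM;
}

/* What the stack does later: test a bit before relying on a capability. */
static int mock_has_feature(const struct mock_net_device *dev,
                            unsigned int mask)
{
    return (dev->features & mask) != 0;
}
```

A driver that leaves every bit clear simply never sees the corresponding kind of skb, which is the point Jamie made above.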

** Scatter-Gather DMA

Among the feature bits shown above, "NETIF_F_SG" is the one the
driver sets to indicate that it can do scatter-gather DMA. If
"NETIF_F_SG" is not set, then the networking stack will make sure that
the "skb"s hold *physically-continuous* data before passing them to
the driver. This is taken care of in "net/core/dev.c:dev_queue_xmit()"
like this:

    if (skb_shinfo(skb)->frag_list &&
        !(dev->features&NETIF_F_FRAGLIST) &&
        skb_linearize(skb, GFP_ATOMIC) != 0) {
            kfree_skb(skb);
            return -ENOMEM;
    }

    /* Fragmented skb is linearized if device does not support SG,
     * or if at least one of fragments is in highmem and device
     * does not support DMA from it.
     */
    if (skb_shinfo(skb)->nr_frags &&
        (!(dev->features&NETIF_F_SG) || illegal_highdma(dev, skb)) &&
        skb_linearize(skb, GFP_ATOMIC) != 0) {
            kfree_skb(skb);
            return -ENOMEM;
    }

As a result, when the "hard_start_xmit()" function of a driver that
has not set "NETIF_F_SG" receives an skb, it knows that the data to be
transmitted start at "skb->data", that their length is "skb->len", and
that they are virtually and physically continuous. The driver can
therefore pass the "skb->data" pointer directly to the device's DMA
controller, after converting it to a bus (DMA) address and
synchronizing the relevant cache entries (by calling something like
"pci_map_single()").
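
The decision made by the two checks quoted from "dev_queue_xmit()" can be restated as a small predicate. The following is only a user-space model: "struct mock_skb_state", the "MOCK_F_*" bits and both functions are hypothetical names made up for illustration, and the highmem test is reduced to a single flag instead of inspecting each fragment's page.

```c
#include <assert.h>

/* Stand-in feature bits (illustrative values, not the kernel's). */
enum {
    MOCK_F_SG       = 1 << 0,
    MOCK_F_HIGHDMA  = 1 << 1,
    MOCK_F_FRAGLIST = 1 << 2
};

/* The properties of an skb that the two checks look at. */
struct mock_skb_state {
    unsigned int nr_frags;  /* number of page fragments */
    int has_frag_list;      /* non-zero if frag_list is set */
    int has_highmem_frag;   /* non-zero if any fragment page is in highmem */
};

/* Models illegal_highdma(): a highmem fragment is a problem only when
 * the device did not declare that it can DMA from high memory. */
static int mock_illegal_highdma(unsigned int features,
                                const struct mock_skb_state *s)
{
    return s->has_highmem_frag && !(features & MOCK_F_HIGHDMA);
}

/* Returns non-zero when the stack would have to linearize the skb
 * before handing it to a device with the given feature bits. */
static int mock_must_linearize(unsigned int features,
                               const struct mock_skb_state *s)
{
    if (s->has_frag_list && !(features & MOCK_F_FRAGLIST))
        return 1;
    if (s->nr_frags &&
        (!(features & MOCK_F_SG) || mock_illegal_highdma(features, s)))
        return 1;
    return 0;
}
```

In this model a device with no feature bits set never receives a fragmented skb, while a device that sets only the SG bit can still have an skb linearized for it when one of the fragments lives in high memory.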

If, on the other hand, the driver sets the "NETIF_F_SG" bit in the
"features" field of the "net_device" structure (declaring that it
*can* do scatter-gather DMA), then any skb passed to it might very
well hold data that are not physically continuous (and sometimes not
even virtually continuous). In this case, for every "skb" passed to
the driver, the networking stack also fills in a "skb_shared_info"
structure, defined in "include/linux/skbuff.h", like this:

    struct skb_shared_info {
            atomic_t        dataref;
            unsigned int    nr_frags;
            struct sk_buff  *frag_list;
            skb_frag_t      frags[MAX_SKB_FRAGS];
    };

This structure is pointed-to by the "end" field of the "sk_buff"
structure, so it can be accessed by the driver as:

(struct skb_shared_info *)skb->end

or even better using the macro "skb_shinfo", which is essentially the
same:

skb_shinfo(skb)
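
The "structure sitting at skb->end" layout can be modelled in user space as follows. This is a sketch under assumptions: "struct mock_skb", "struct mock_shared_info", "mock_shinfo()" and the two helper functions are invented for illustration, and the allocation only imitates, as far as I understand it, the way the kernel reserves room for the shared info right after the data area.

```c
#include <assert.h>
#include <stdlib.h>

/* Much-reduced stand-in for struct skb_shared_info. */
struct mock_shared_info {
    unsigned int nr_frags;
};

/* Much-reduced stand-in for struct sk_buff. */
struct mock_skb {
    unsigned char *head;  /* start of the allocated buffer */
    unsigned char *data;  /* start of the valid data */
    unsigned char *end;   /* end of the data area; shared info starts here */
};

/* Same idea as skb_shinfo(): reinterpret the "end" pointer. */
#define mock_shinfo(skb) ((struct mock_shared_info *)((skb)->end))

/* Allocate one buffer holding the data area followed by the shared info.
 * Returns 0 on success, -1 if the allocation fails. */
static int mock_skb_alloc(struct mock_skb *skb, size_t size)
{
    /* Round the data area up so the trailing structure stays aligned. */
    size_t align = sizeof(void *);
    size_t area = (size + align - 1) / align * align;

    skb->head = malloc(area + sizeof(struct mock_shared_info));
    if (!skb->head)
        return -1;
    skb->data = skb->head;
    skb->end = skb->head + area;
    mock_shinfo(skb)->nr_frags = 0;
    return 0;
}

static void mock_skb_free(struct mock_skb *skb)
{
    free(skb->head);
    skb->head = NULL;
}
```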

It should be obvious that, in the scatter-gather case, the frame to be
transmitted consists of a sequence of fragments (parts), each of which
holds a virtually and physically continuous subset of the data. The
start of the first fragment is pointed to by "skb->data" (as in the
non-SG case), but its length (in bytes) is "skb->len - skb->data_len"
(which can also be obtained with the helper "skb_headlen()" defined in
"include/linux/skbuff.h"). "skb->len" is still the length of the
*full* frame (the sum of the lengths of all the fragments), and
"skb->data_len" is the total length of all the data fragments, not
counting the first "header" fragment pointed to by "skb->data". So a
way to check whether an skb is *not* physically continuous is to test
if "skb->data_len" is non-zero; there is even a helper for this
("skb_is_nonlinear()") defined in "include/linux/skbuff.h". After the
initial "header" fragment, exactly "skb_shinfo(skb)->nr_frags"
fragments follow. Each of these fragments is described by a
"skb_frag_t" structure, defined (in "include/linux/skbuff.h") as:

    struct skb_frag_struct
    {
            struct page *page;
            __u16 page_offset;
            __u16 size;
    };

...

typedef struct skb_frag_struct skb_frag_t;

The "skb_frag_struct" structure corresponding to the I'th fragment can
be accessed as:

skb_shinfo(skb)->frags[I];
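
The relationship between "len", "data_len" and the header length is easy to get wrong, so here is a small user-space model of it. Everything prefixed with "mock_" or "MOCK_" is hypothetical and exists only for this example (including the fragment limit, which is just an illustrative number); the two one-line helpers imitate what "skb_headlen()" and "skb_is_nonlinear()" are described as doing above.

```c
#include <assert.h>

#define MOCK_MAX_FRAGS 6  /* illustrative limit, stands in for MAX_SKB_FRAGS */

/* Stand-in for skb_frag_t, without the page pointer. */
struct mock_frag {
    unsigned short page_offset;
    unsigned short size;
};

/* Only the length bookkeeping of an sk_buff. */
struct mock_skb {
    unsigned int len;       /* length of the full frame */
    unsigned int data_len;  /* bytes held in the fragments */
    unsigned int nr_frags;
    struct mock_frag frags[MOCK_MAX_FRAGS];
};

/* Length of the first ("header") part, the one at skb->data. */
static unsigned int mock_headlen(const struct mock_skb *skb)
{
    return skb->len - skb->data_len;
}

/* Non-zero when some of the data live in fragments. */
static int mock_is_nonlinear(const struct mock_skb *skb)
{
    return skb->data_len != 0;
}

/* Append a fragment, keeping len and data_len consistent.
 * Returns 0 on success, -1 when the fragment array is full. */
static int mock_add_frag(struct mock_skb *skb,
                         unsigned short off, unsigned short size)
{
    if (skb->nr_frags >= MOCK_MAX_FRAGS)
        return -1;
    skb->frags[skb->nr_frags].page_offset = off;
    skb->frags[skb->nr_frags].size = size;
    skb->nr_frags++;
    skb->len += size;
    skb->data_len += size;
    return 0;
}
```

Note that adding a fragment grows both "len" and "data_len", so the header length stays the same.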

So the data of a non-linear "sk_buff" "skb" consist of the following
parts, which are themselves linear (virtually and physically
continuous):

            addr of                  addr of
 part #  :  first byte               last byte
 ------------------------------------------------------------------
 0       :  skb->data           ...  skb->data + (skb->len - skb->data_len) - 1
 1       :  fr_adr(0)           ...  fr_adr(0) + fr_sz(0) - 1
 .
 .
 nfrags  :  fr_adr(nfrags - 1)  ...  fr_adr(nfrags - 1) + fr_sz(nfrags - 1) - 1

where:

"nfrags" is "skb_shinfo(skb)->nr_frags"

and

"fr_adr(i)" is "fr_pg_adr(i) + fr_pg_ofs(i)"
"fr_pg_adr(i)" is "page_address(skb_shinfo(skb)->frags[i].page)"
"fr_pg_ofs(i)" is "skb_shinfo(skb)->frags[i].page_offset"
"fr_sz(i)" is "skb_shinfo(skb)->frags[i].size"

NOTICE: "fr_adr", "fr_sz", "fr_pg_adr", and "fr_pg_ofs" are just
notation introduced for the convenience of our discussion; they are not
actually defined as macros in the kernel. "page_address", on the
other hand, is a real macro defined in "include/linux/mm.h".

It also holds that:

skb->data_len == fr_sz(0) + ... + fr_sz(nfrags - 1)
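
The table of parts and the length identity can be put together in a short user-space sketch that turns an skb into a flat list of (address, length) segments, roughly what a driver would then feed to its DMA descriptors. All "mock_" names are invented for this example; in particular "page_addr" merely stands in for the result of "page_address(frag->page)", and no real DMA mapping is performed.

```c
#include <assert.h>
#include <stddef.h>

#define MOCK_MAX_FRAGS 6  /* illustrative limit, stands in for MAX_SKB_FRAGS */

struct mock_frag {
    unsigned char *page_addr;    /* stands in for page_address(page) */
    unsigned short page_offset;
    unsigned short size;
};

struct mock_skb {
    unsigned char *data;
    unsigned int len, data_len, nr_frags;
    struct mock_frag frags[MOCK_MAX_FRAGS];
};

/* One linear (virtually and physically continuous) part of the frame. */
struct mock_seg {
    unsigned char *addr;
    unsigned int len;
};

/* Fill segs[] with part 0 (the header) followed by one entry per
 * fragment. Returns the number of segments, or -1 if the skb is
 * inconsistent or segs[] is too small. */
static int mock_build_seg_list(const struct mock_skb *skb,
                               struct mock_seg *segs, size_t max_segs)
{
    unsigned int total = 0;
    unsigned int i;
    size_t n = 0;

    if (skb->nr_frags > MOCK_MAX_FRAGS || skb->data_len > skb->len ||
        max_segs < 1 + (size_t)skb->nr_frags)
        return -1;

    /* Part 0: starts at skb->data, length is len - data_len. */
    segs[n].addr = skb->data;
    segs[n].len = skb->len - skb->data_len;
    total += segs[n].len;
    n++;

    /* Parts 1..nfrags: fr_adr(i) = page address + page offset. */
    for (i = 0; i < skb->nr_frags; i++) {
        const struct mock_frag *f = &skb->frags[i];

        segs[n].addr = f->page_addr + f->page_offset;
        segs[n].len = f->size;
        total += f->size;
        n++;
    }

    /* The lengths of all parts must add up to the full frame length. */
    if (total != skb->len)
        return -1;
    return (int)n;
}
```

The final check is just the identity above in another form: it fails exactly when "data_len" does not equal the sum of the fragment sizes.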

For an example of how these are used in a real driver see
"e100_main.c", and especially the function "e100_prepare_xmit_buff()"
which contains all the details of handling the fragment-sequence.

/npat

--
But the delight and pride of Aule is in the deed of making, and in the
thing made, and neither in possession nor in his own mastery;
wherefore he gives and hoards not, and is free from care, passing ever
on to some new work."
-- J.R.R. Tolkien, Ainulindale (Silmarillion)