.\" $Id: 2.t,v 1.1 2001/07/04 05:29:25 itojun Exp $
.\"
.\".ds RH KAME approach
.NH 1
KAME approach
.PP
This section describes the approaches we at the KAME project
took to address the problems mentioned in the previous section.
We introduce a new function called
.I m_pulldown,
in place of
.I m_pullup,
for adjusting payload data in the mbuf.
We also change the calling sequence for the protocol input function.
.NH 2
What is the KAME project?
.PP
In the early days of IPv6/IPsec development,
the Japanese research community felt it very important to make
reference code available in a freely-redistributable form
for educational, research and deployment purposes.
The KAME project is a consortium of 7 Japanese companies and
an academic research group.
The project aims to deliver an IPv6/IPsec reference implementation
for 4.4BSD, under the BSD license.
The KAME project intends to deliver the most
spec-conformant IPv6/IPsec implementation possible.
.NH 2
m_pulldown function
.PP
Here we introduce a new function,
.I m_pulldown,
to address the 3 problems with
.I m_pullup
that we have described above.
The actual source code is included at the end of this paper.
The function prototype is as follows:
.DS
.SM
\f[CR]struct mbuf *
m_pulldown(m, off, len, offp)
    struct mbuf *m;
    int off, len;
    int *offp;\fP
.NL
.DE
.I m_pulldown
will ensure that the data region in the mbuf chain, starting at
.I off
and ending at
.I "off + len",
is put into a contiguous memory region.
.I len
must be smaller than, or equal to, MCLBYTES (2048 bytes).
The function returns a pointer to an intermediate mbuf in the chain
(we refer to the pointer as \fIn\fP), and stores the new offset within
.I n
in
.I *offp.
If
.I offp
is NULL, the resulting region can be located by
.I "mtod(n, caddr_t)";
if
.I offp
is non-null, it will be located at
.I "mtod(n, caddr_t) + *offp".
The mbufs prior to
.I off
will remain untouched, so pointers into that part of the
mbuf chain remain valid.
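.PP
The contract described above can be illustrated with a small userspace model.
This is only a sketch under stated assumptions, not the KAME implementation:
\fItoy_mbuf\fP, \fItoy_alloc\fP, \fItoy_pulldown\fP and TOY_MAXLEN are
hypothetical names invented for this example, each node carries a fixed
array instead of real mbuf storage, and on failure the model simply
returns NULL and leaves the chain alone.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define TOY_MAXLEN 2048   /* assumption: stands in for MCLBYTES */

/* Hypothetical userspace stand-in for struct mbuf. */
struct toy_mbuf {
    struct toy_mbuf *next;
    int len;                          /* valid bytes in data[] */
    unsigned char data[TOY_MAXLEN];
};

/* Allocate a node holding a copy of len bytes from src (src may be NULL). */
static struct toy_mbuf *
toy_alloc(const unsigned char *src, int len, struct toy_mbuf *next)
{
    struct toy_mbuf *m = calloc(1, sizeof(*m));

    if (m == NULL)
        return NULL;
    if (src != NULL && len > 0)
        memcpy(m->data, src, (size_t)len);
    m->len = len;
    m->next = next;
    return m;
}

/* Make bytes [off, off + len) of the chain contiguous; return the node n. */
static struct toy_mbuf *
toy_pulldown(struct toy_mbuf *m, int off, int len, int *offp)
{
    struct toy_mbuf *n = m, *o, *t, *tail = NULL;
    int avail, head, rest;

    if (off < 0 || len <= 0 || len > TOY_MAXLEN)
        return NULL;

    /* Find the node n that holds byte off; off becomes relative to n. */
    while (n != NULL && off >= n->len) {
        off -= n->len;
        n = n->next;
    }
    if (n == NULL)
        return NULL;

    /* No copy needed: the region already sits inside n. */
    if (off + len <= n->len && (offp != NULL || off == 0)) {
        if (offp != NULL)
            *offp = off;
        return n;
    }

    /* Refuse to touch the chain if it is too short. */
    avail = n->len - off;
    for (t = n->next; t != NULL && avail < len; t = t->next)
        avail += t->len;
    if (avail < len)
        return NULL;

    /* Start the new node o with the part of the region that lives in n. */
    head = n->len - off;
    if (head > len)
        head = len;
    o = toy_alloc(n->data + off, head, NULL);
    if (o == NULL)
        return NULL;

    /* Bytes of n past the region (only possible when offp is NULL). */
    if (off + head < n->len) {
        tail = toy_alloc(n->data + off + head, n->len - off - head, n->next);
        if (tail == NULL) {
            free(o);
            return NULL;
        }
    }

    /* Move the rest of the region out of the following nodes. */
    rest = len - head;
    for (t = n->next; rest > 0; t = t->next) {
        int take = t->len < rest ? t->len : rest;

        memcpy(o->data + o->len, t->data, (size_t)take);
        memmove(t->data, t->data + take, (size_t)(t->len - take));
        o->len += take;
        t->len -= take;
        rest -= take;
    }

    /* Splice o in after n; the bytes before off in n are not moved. */
    o->next = tail != NULL ? tail : n->next;
    n->len = off;
    n->next = o;
    if (offp != NULL)
        *offp = 0;
    return o;
}
```

A caller of this model reads the region at \fIn->data + *offp\fP when it
passes a non-null \fIoffp\fP, and at \fIn->data\fP otherwise, which mirrors
the two \fImtod\fP expressions given above.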
For example, consider the mbuf chain
.nr figure +1
in Figure \n[figure]
.nr figure -1
as the input.
.KF
.PS
define pointer { box ht boxht*1/4 }
define payload { box }
IP: [
    IPp: pointer
    IPd: payload with .n at bottom of IPp "mbuf1" "50 bytes"
]
move
TCP: [
    TCPp: pointer
    TCPd: payload with .n at bottom of TCPp "mbuf2" "20 bytes"
]
arrow from IP.IPp.center to TCP.TCPp.center
.PE
.ce
.nr figure +1
Figure \n[figure]: mbuf chain before the call to \fIm_pulldown\fP
.KE
If we call
.I m_pulldown
with
.I "off = 40",
.I "len = 10",
and a non-null
.I offp,
the mbuf chain will remain unchanged.
The return value will be a pointer to mbuf1, and
.I *offp
will be set to 40.
If we call
.I m_pulldown
with
.I "off = 40",
.I "len = 20",
and a null
.I offp,
then the mbuf chain will be modified as shown
.nr figure +1
in Figure \n[figure],
.nr figure -1
by inserting a newly allocated mbuf, mbuf3, into the middle and moving data
from both mbuf1 and mbuf2 into it.
The function returns a pointer to mbuf3.
.KF
.PS
define pointer { box ht boxht*1/4 }
define payload { box }
IP: [
    IPp: pointer
    IPd: payload with .n at bottom of IPp "mbuf1" "40 bytes"
]
move 0.2;
INT: [
    INTp: pointer
    INTd: payload with .n at bottom of INTp "mbuf3" "20 bytes"
]
move 0.2;
TCP: [
    TCPp: pointer
    TCPd: payload with .n at bottom of TCPp "mbuf2'" "10 bytes"
]
arrow from IP.IPp.center to INT.INTp.center
arrow from INT.INTp.center to TCP.TCPp.center
.PE
.ce
.nr figure +1
Figure \n[figure]: mbuf chain after the call to \fIm_pulldown\fP, with \fIoff = 40\fP and \fIlen = 20\fP
.KE
The
.I m_pulldown
function solves all 3 problems in
.I m_pullup
that were described in the previous section.
.I m_pulldown
does not copy mbufs when copying is not necessary.
Since it does not modify the mbuf chain prior to the specified offset
.I off,
it is not necessary for the caller to re-initialize the pointers
into the mbuf data region.
With
.I m_pullup,
we always needed to specify the data payload length,
starting from the very first byte in the packet.
With
.I m_pulldown,
we pass
.I off
as the offset to the data payload we are interested in.
This change avoids extra data manipulation when we are only
interested in the intermediate data portion of the packet.
It also relaxes the assumption regarding total packet header length.
While
.I m_pullup
assumes that the total packet header length is smaller than or equal to
MHLEN (100 bytes),
.I m_pulldown
assumes that the length of a single packet header is smaller than or equal to
MCLBYTES (2048 bytes).
With the mbuf framework this is the best we can do, since there is no way
to hold a contiguous region longer than MCLBYTES in a standard mbuf chain.
.NH 2
New function prototype for inbound packet processing
.PP
For IPv6 processing, our code does not make a deep function call chain.
Rather, we make a loop in the very last part of
.I ip6_input,
as shown in Figure 8.
IPPROTO_DONE is a pseudo-protocol type value that identifies the end
of the extension header chain.
If more protocol headers exist, each header processing code will
update the pointer variables and return the next extension header type.
If the final header in the chain has been reached, IPPROTO_DONE is returned.
.\" figure 8
.nr figure +1
With this code, we no longer have a deep call chain for IPv6/IPsec processing.
Rather,
.I ip6_input
will make calls to each extension header processor directly.
This avoids the possibility of overflowing the kernel stack
when processing multiple extension headers.
.KF
.PS
A: ellipse "\fIip6_input\fP"
right
move
move
up
move
B: ellipse "\fIrthdr6_input\fP"
move to last ellipse .s
down
C: ellipse "\fIah_input\fP"
D: ellipse "\fIesp_input\fP"
E: ellipse "\fItcp_input\fP"
arrow from 1/4 to 1/4
arrow from 1/4 to 1/4
arrow from 1/4 to 1/4
arrow from 1/4 to 1/4
arrow from 1/4 to 1/4
arrow from 1/4 to 1/4
arrow from 3/8 to 1/4
arrow from 3/8 to 1/4
.PE
.ce
.nr figure +1
Figure \n[figure]: KAME avoids function call chain by making a loop in \fIip6_input\fP
.KE
.PP
Regardless of the calling sequence imposed by the
.I pr_input
function prototype, it is important not to use up the kernel stack region
in protocol handlers.
Sometimes it is necessary to reduce kernel stack usage
by replacing large local variables with pointers to
dynamically allocated regions.
.1C
.KF
.DS
.ps 8
.vs 9
\f[CR]struct ip6protosw {
    int (*pr_input) __P((struct mbuf **, int *, int));
    /* and other members */
};

ip6_input(m)
    struct mbuf *m;
{
    /* in the very last part */
    extern struct ip6protosw inet6sw[];
    /* the first one in extension header chain */
    nxt = ip6.ip6_nxt;
    while (nxt != IPPROTO_DONE)
        nxt = (*inet6sw[ip6_protox[nxt]].pr_input)(&m, &off, nxt);
}

/* in each header processing code */
int
foohdr_input(mp, offp, proto)
    struct mbuf **mp;
    int *offp;
    int proto;
{
    /* some processing, may modify mbuf chain */

    if (we have more header to go) {
        *mp = newm;
        *offp = nxtoff;
        return nxt;
    } else {
        m_freem(newm);
        return IPPROTO_DONE;
    }
}\fP
.DE
.NL
.ce
Figure 8: KAME IPv6 header chain processing code.
.KE
.if t .2C
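.PP
The effect of the loop in Figure 8 on call depth can be observed with a
small userspace model.
This is a sketch under stated assumptions, not kernel code: the names
prefixed with \fItoy_\fP and TOY_, the protocol numbers, and the two-byte
header layout (next header type, header length) are all hypothetical and
were chosen only to keep the example short.
Each handler records how deeply it is nested, so that the flat calling
sequence becomes visible.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical protocol numbers, chosen only for this sketch. */
enum { TOY_ROUTING, TOY_AH, TOY_ESP, TOY_TCP, TOY_DONE, TOY_NPROTO = TOY_DONE };

/* Toy packet: every extension header starts with { next type, length }. */
struct toy_pkt {
    const unsigned char *buf;
    int len;
    int depth;       /* current handler nesting depth */
    int max_depth;   /* deepest nesting seen so far */
    int handled;     /* number of headers processed */
};

typedef int (*toy_input_fn)(struct toy_pkt **, int *, int);

/* Generic extension header: advance *offp and return the next type. */
static int
toy_exthdr_input(struct toy_pkt **pp, int *offp, int proto)
{
    struct toy_pkt *p = *pp;
    int nxt = TOY_DONE;

    (void)proto;
    if (++p->depth > p->max_depth)
        p->max_depth = p->depth;
    /* Accept only complete headers of at least two bytes. */
    if (*offp + 2 <= p->len && p->buf[*offp + 1] >= 2 &&
        *offp + p->buf[*offp + 1] <= p->len) {
        nxt = p->buf[*offp];
        *offp += p->buf[*offp + 1];
        p->handled++;
    }
    p->depth--;
    return nxt;
}

/* Final header: consume the rest of the packet and end the chain. */
static int
toy_tcp_input(struct toy_pkt **pp, int *offp, int proto)
{
    struct toy_pkt *p = *pp;

    (void)proto;
    if (++p->depth > p->max_depth)
        p->max_depth = p->depth;
    *offp = p->len;
    p->handled++;
    p->depth--;
    return TOY_DONE;
}

/* Dispatch table indexed by protocol number, in the role of inet6sw[]. */
static const toy_input_fn toy_protosw[TOY_NPROTO] = {
    toy_exthdr_input, toy_exthdr_input, toy_exthdr_input, toy_tcp_input
};

/* The loop itself: one handler call per header, never nested. */
static int
toy_ip6_input(struct toy_pkt *p, int first)
{
    int off = 0, nxt = first;

    while (nxt != TOY_DONE) {
        if (nxt < 0 || nxt >= TOY_NPROTO)
            break;                    /* unknown protocol: stop */
        nxt = (*toy_protosw[nxt])(&p, &off, nxt);
    }
    return off;
}
```

In this model the recorded nesting depth stays at one however many headers
the packet carries, because every handler returns to the loop before the
next one is called.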