Tips: 2006-02

2006-02-22

A Brief Look at the Causes of "watchdog timeout"

xie_minix

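Recently quite a few people have been discussing the network card "watchdog timeout" problem. What actually causes it? Most people blame poor NIC performance. I think NIC performance is only one factor; the buffer size, the number of packets per unit of time, and the NIC driver are all involved as well. Below I analyze the problem from the source code.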

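First, let's see which function actually prints the string "watchdog timeout". A quick look at the source shows that it comes from the XX_watchdog function in each NIC driver (XX is the driver name, e.g. rl for the 8139, pcn for AMD PCnet, fxp for Intel, and so on). The function is fairly simple: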

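static void rl_watchdog(ifp)
	struct ifnet *ifp;
/* ifp is declared as a pointer to an ifnet structure. The structure
   holds the card's input and output function pointers and some
   important parameters, including the pointer to rl_watchdog itself. */
{
	struct rl_softc *sc;
	sc = ifp->if_softc;
	/* ifnet is a subset of the softc structure; softc holds more of the card's parameters */

	printf("rl%d: watchdog timeout\n", sc->rl_unit);
	/* Print which card has the problem.
	   sc->rl_unit is the unit number of this type of card. One machine
	   may contain several cards of the same type; this field is filled
	   in by the driver's initialization code. */

	ifp->if_oerrors++;
	/* Count the output error (o stands for output) */

	rl_txeof(sc);
	/* This part differs from driver to driver; this is the 8139 version, though I think rl_stop(sc) would be better */

	rl_rxeof(sc);
	/* This also differs from driver to driver. I think an rl_reset(sc) would not be bad either. */

	rl_init(sc); /* Every driver does this: reinitialize the chip. */

	return;
}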

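Good, so now we know that XX_watchdog prints the watchdog timeout message. But who calls it? Let's look at another function, if_slowtimo, which lives in if.c ("if" is short for interface). Some of the functions in this file must be called when the system initializes during boot. ifinit is one of them, and it is very simple: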

void
ifinit()
{
	static struct timeout if_slowtim;

	timeout_set(&if_slowtim, if_slowtimo, &if_slowtim);
	/* Simply put: set up a timer */

	if_slowtimo(&if_slowtim);
	/* Ha, can't wait: call it once right away */
}

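As a result, if_slowtimo runs at a fixed interval. By now you can guess roughly what it does: it periodically checks the transmit state of every NIC. A driver arms a per-interface counter when it starts sending; if_slowtimo decrements that counter on every pass, and if the counter reaches 0 before the data has gone out, it calls that card's XX_watchdog function. Here is if_slowtimo: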

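void
if_slowtimo(arg)
	void *arg;
{
	struct timeout *to = (struct timeout *)arg;
	struct ifnet *ifp;
	int s = splimp();
	/* Network interrupts must be blocked while doing the following */

	TAILQ_FOREACH(ifp, &ifnet, if_list) {
		/* Walk every interface; TAILQ_FOREACH is really a for(...) loop */
		if (ifp->if_timer == 0 || --ifp->if_timer)
			continue;
		/* Skip the interface if if_timer is 0 (not armed) or is still
		   nonzero after being decremented. In other words, decrement
		   each armed card's counter and call its watchdog function
		   only when the counter has just reached 0. */

		if (ifp->if_watchdog) /* but first check that the card has a watchdog function */
			(*ifp->if_watchdog)(ifp);
	}

	splx(s);

	timeout_add(to, hz / IFNET_SLOWHZ);
	/* A timeout is cleared every time it fires, so it has to be added
	   again. hz is the number of clock ticks per second, so the
	   function is scheduled IFNET_SLOWHZ times per second. */
}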
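
The countdown test in if_slowtimo is compact enough to be misread. The following is a minimal user-space sketch of the same test, not kernel code; `struct fake_if`, `slowtimo_tick` and `run_ticks` are hypothetical names introduced only for illustration.

```c
/* Hypothetical stand-in for the one ifnet field that matters here. */
struct fake_if {
    int if_timer;       /* 0 = watchdog disarmed, N = fire after N passes */
    int watchdog_calls; /* how many times the watchdog "fired" */
};

/* One pass of the slow timer over one interface, using the same test as if_slowtimo. */
static void slowtimo_tick(struct fake_if *ifp)
{
    if (ifp->if_timer == 0 || --ifp->if_timer)
        return;            /* disarmed, or still counting down */
    ifp->watchdog_calls++; /* the timer just went from 1 to 0 */
}

/* Run `ticks` passes and return the total number of watchdog calls so far. */
static int run_ticks(struct fake_if *ifp, int ticks)
{
    for (int i = 0; i < ticks; i++)
        slowtimo_tick(ifp);
    return ifp->watchdog_calls;
}
```

With if_timer set to 5, the first four passes only count down and the fifth calls the watchdog once; an interface whose timer is 0 is never touched.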

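Now everything is clear. When the driver sends a packet it sets if_timer to some value, and the output function keeps handing packets to the card as long as a transmit slot is free. Meanwhile the routine above subtracts 1 from that value each time it runs; if the value reaches 0 before the transmission completes, you get a watchdog timeout. Let's look at the code: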

static void rl_start(ifp)
	struct ifnet *ifp;
{
	struct rl_softc *sc;
	struct mbuf *m_head = NULL;

	sc = ifp->if_softc;

	while (RL_CUR_TXMBUF(sc) == NULL) { /* 1: start a new transmission only when the current transmit buffer is free */
		IF_DEQUEUE(&ifp->if_snd, m_head); /* take the next packet to send off the send queue */
		if (m_head == NULL) /* 2: nothing was dequeued (the author's memory-allocation case) */
			break;
		if (rl_encap(sc, m_head)) {
			/* This is where the 8139 shows its weakness: it needs an
			   extra header and longword alignment, so the data has to
			   be copied again. That costs time! */
			IF_PREPEND(&ifp->if_snd, m_head);
			ifp->if_flags |= IFF_OACTIVE;
			break;
		}
		if (ifp->if_bpf) /* 3: if a BPF packet filter is attached, pass it the packet */
			bpf_mtap(ifp, RL_CUR_TXMBUF(sc));
		CSR_WRITE_4(sc, RL_CUR_TXADDR(sc), /* 4: the hardware I/O writes that start the transmission */
		    vtophys(mtod(RL_CUR_TXMBUF(sc), caddr_t)));
		CSR_WRITE_4(sc, RL_CUR_TXSTAT(sc),
		    RL_TXTHRESH(sc->rl_txthresh) |
		    RL_CUR_TXMBUF(sc)->m_pkthdr.len);
		RL_INC(sc->rl_cdata.cur_tx);
	}

	if (RL_CUR_TXMBUF(sc) != NULL) /* the transmit buffer is not empty: data is in the buffer and ready to go */
		ifp->if_flags |= IFF_OACTIVE; /* set the "output active" flag */

	ifp->if_timer = 5; /* set the counter to 5 */

	return;
}

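static void rl_intr(arg)
{
	... (middle part omitted)

	if ((status & RL_ISR_TX_OK) || (status & RL_ISR_TX_ERR))
		/* If the interrupt status register reports either success or an error, call the routine below. */
		rl_txeof(sc);

	...
}

Now look at rl_txeof:

static void rl_txeof(sc)
{
	...

	ifp->if_timer = 0;
	/* Ha, it is cleared to 0 here. In other words, as long as the
	   transmission does not stay stuck, "watchdog timeout" will not
	   appear, whether the transmission failed or succeeded. */

	...
}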
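
To make the interplay between rl_start (which arms the timer) and rl_txeof (which clears it) concrete, here is a small user-space model. It is only a sketch under simplifying assumptions: one transmission, one slow-timer pass per tick, and a hypothetical `done_after` parameter saying after how many passes the transmit interrupt arrives.

```c
#define WATCHDOG_TICKS 5 /* the value rl_start stores in if_timer */

/*
 * Model one transmission started at tick 0. The "transmit done or
 * error" interrupt arrives after `done_after` slow-timer passes; a
 * negative value means it never arrives.
 * Returns the pass on which the watchdog fires, or 0 if it never does.
 */
static int watchdog_tick_for(int done_after)
{
    int if_timer = WATCHDOG_TICKS;        /* rl_start: arm the timer */

    for (int tick = 1; tick <= 2 * WATCHDOG_TICKS; tick++) {
        if (done_after >= 0 && tick > done_after)
            if_timer = 0;                 /* rl_txeof: disarm the timer */
        if (if_timer == 0 || --if_timer)
            continue;                     /* same test as if_slowtimo */
        return tick;                      /* rl_watchdog would run here */
    }
    return 0;
}
```

In this model a completion that arrives any time before the fifth pass prevents the message, while a transmission that never completes triggers the watchdog on the fifth pass.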

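To sum up, the main causes of watchdog timeout are: 1. The buffer is not large enough: earlier packets have not finished sending when later ones arrive. 2. A kernel memory allocation problem, which happens relatively rarely. 3. The quality of the card (its I/O throughput). How do we solve these problems?
First we have to find out which of these causes is responsible. To do that, we add a global variable definition to if.h: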

u_int8_t myerror; /* code for the cause of the error, numbered as in my list: 1 means the buffer is not large enough, ... */

Then add the following to static void rl_start(ifp):

static void rl_start(ifp)
	struct ifnet *ifp;
{
	struct rl_softc *sc;
	struct mbuf *m_head = NULL;
	u_int8_t tmperror; /* declared before the first statement, as C89 requires */

	sc = ifp->if_softc;

	if (RL_CUR_TXMBUF(sc) != NULL) { /* new: the buffer-not-large-enough case */
		myerror = 1;
	}

	while (RL_CUR_TXMBUF(sc) == NULL) {
		IF_DEQUEUE(&ifp->if_snd, m_head);
		if (m_head == NULL) {
			myerror = 2; /* cause 2 (the author's memory-allocation case): nothing was dequeued */
			break;
		}
		if (rl_encap(sc, m_head)) {
			IF_PREPEND(&ifp->if_snd, m_head);
			ifp->if_flags |= IFF_OACTIVE;
			break;
		}
		if (ifp->if_bpf)
			bpf_mtap(ifp, RL_CUR_TXMBUF(sc));

		tmperror = myerror; /* save the earlier cause before writing to the I/O ports */

		CSR_WRITE_4(sc, RL_CUR_TXADDR(sc),
		    vtophys(mtod(RL_CUR_TXMBUF(sc), caddr_t)));
		CSR_WRITE_4(sc, RL_CUR_TXSTAT(sc),
		    RL_TXTHRESH(sc->rl_txthresh) |
		    RL_CUR_TXMBUF(sc)->m_pkthdr.len);

		myerror = tmperror; /* if the two writes above went fine, restore the earlier cause */

		RL_INC(sc->rl_cdata.cur_tx);
	}

	if (RL_CUR_TXMBUF(sc) != NULL)
		ifp->if_flags |= IFF_OACTIVE;

	ifp->if_timer = 5;

	return;
}

Finally, change the message printed in rl_watchdog:

printf("rl%d: watchdog timeout:error number is %x\n", sc->rl_unit,myerror);
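
A bare number in the log is easy to misinterpret later. As a possible refinement, and purely as a sketch, the cause codes could be given names and printed as text. The enum, the wording of the strings and the helper names below are all hypothetical; only the numbering 1 to 3 follows the list in this article.

```c
#include <stdio.h>

/* Hypothetical names for the cause codes 1..3 listed in the article. */
enum wd_cause {
    WD_NONE = 0,       /* nothing recorded */
    WD_TXBUF_BUSY = 1, /* transmit buffer still occupied */
    WD_NO_MBUF = 2,    /* nothing obtained from the send queue */
    WD_CARD_IO = 3     /* card I/O throughput */
};

/* Map a cause code to a short description. */
static const char *wd_cause_name(unsigned cause)
{
    switch (cause) {
    case WD_NONE:       return "none recorded";
    case WD_TXBUF_BUSY: return "transmit buffer still busy";
    case WD_NO_MBUF:    return "no mbuf obtained";
    case WD_CARD_IO:    return "card I/O throughput";
    default:            return "unknown";
    }
}

/* Format a watchdog message that carries both the code and its name. */
static int wd_format(char *buf, size_t len, int unit, unsigned cause)
{
    return snprintf(buf, len, "rl%d: watchdog timeout: cause %u (%s)",
        unit, cause, wd_cause_name(cause));
}
```

In a driver the formatted text would go straight to printf; the buffer version is used here only so the result can be inspected in user space.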

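Of course this is only my personal view, and there may be many shortcomings or things I have not considered. I hope others can suggest better and easier methods.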

2006-02-15

FreeBSD Networking over FireWire --- fwe

Monday, January 30, 2006

You might be familiar with Apple's implementation of IP over FireWire. This allows connecting two computers directly over FireWire ports. FreeBSD offers two drivers that provide networking over FireWire. fwe is a non-standard protocol, but it is implemented by default in the GENERIC kernel. fwip implements RFC 2734 (IPv4 over IEEE 1394) and RFC 3146 (Transmission of IPv6 Packets over IEEE 1394 Networks); it is available via kernel module.

I decided to have my laptop orr talk to my server janney using FireWire. To implement FireWire, orr uses an Adaptec DuoConnect PC Card Adapter and janney uses an Adaptec DuoConnect PCI Adapter. Both provide FireWire and USB 2.0.

Each system is running FreeBSD 6.0.

The laptop dmesg sees the following when the FireWire adapter is inserted.


cardbus0: CIS pointer is 0!
cardbus0: Resource not specified in CIS: id=10, size=800
cardbus0: Resource not specified in CIS: id=14, size=4000
fwohci0:  mem 0x88004000-0x880047ff,0x88000000-0x88003fff irq 11 at device 0.0 on cardbus0
fwohci0: OHCI version 1.10 (ROM=1)
fwohci0: No. of Isochronous channels is 4.
fwohci0: EUI64 08:00:28:56:02:00:49:8a
fwohci0: Phy 1394a available S400, 3 ports.
fwohci0: Link S400, max_rec 2048 bytes.
firewire0:  on fwohci0
fwe0:  on firewire0
if_fwe0: Fake Ethernet address: 0a:00:28:00:49:8a
fwe0: Ethernet address: 0a:00:28:00:49:8a
fwe0: if_start running deferred for Giant
sbp0:  on firewire0
fwohci0: Initiate bus reset
fwohci0: node_id=0xc000ffc0, gen=1, CYCLEMASTER mode
firewire0: 1 nodes, maxhop <= 0, cable IRM = 0 (me)
firewire0: bus manager 0 (me)
cardbus0: CIS pointer is 0!
cardbus0: Resource not specified in CIS: id=10, size=1000
ohci0:  mem 0x88000000-0x88000fff irq 11 at device 0.4 on cardbus0
ohci0: [GIANT-LOCKED]
usb1: OHCI version 0.0
usb1: unsupported OHCI revision
ohci0: USB init failed

The server dmesg sees the following at boot.


ohci0:  mem 0xf7fff000-0xf7ffffff irq 22 at device 8.0 on pci4
ohci0: [GIANT-LOCKED]
usb1: OHCI version 1.0
usb1:  on ohci0
usb1: USB revision 1.0
uhub1: NEC OHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub1: 3 ports with 3 removable, self powered
ohci1:  mem 0xf7ffe000-0xf7ffefff irq 21 at device 8.1 on pci4
ohci1: [GIANT-LOCKED]
usb2: OHCI version 1.0
usb2:  on ohci1
usb2: USB revision 1.0
uhub2: NEC OHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub2: 2 ports with 2 removable, self powered
ehci0:  mem 0xf7ffdc00-0xf7ffdcff irq 20 at device 8.2 on pci4
ehci0: [GIANT-LOCKED]
usb3: EHCI version 0.95
usb3: companion controllers, 3 ports each: usb1 usb2
usb3:  on ehci0
usb3: USB revision 2.0
uhub3: NEC EHCI root hub, class 9/0, rev 2.00/1.00, addr 1
uhub3: 5 ports with 5 removable, self powered
fwohci0:  mem 0xf7ffd000-0xf7ffd7ff,0xf7ff8000-0xf7ffbfff irq 22 at device 12.0 on pci4
fwohci0: OHCI version 1.10 (ROM=1)
fwohci0: No. of Isochronous channels is 4.
fwohci0: EUI64 00:50:42:b5:c0:0d:0d:af
fwohci0: Phy 1394a available S400, 3 ports.
fwohci0: Link S400, max_rec 2048 bytes.
firewire0:  on fwohci0
fwe0:  on firewire0
if_fwe0: Fake Ethernet address: 02:50:42:0d:0d:af
fwe0: Ethernet address: 02:50:42:0d:0d:af
fwe0: if_start running deferred for Giant
sbp0:  on firewire0
fwohci0: Initiate bus reset
fwohci0: node_id=0xc800ffc1, gen=1, CYCLEMASTER mode
firewire0: 2 nodes, maxhop <= 1, cable IRM = 1 (me)
firewire0: bus manager 1 (me)
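
The "Fake Ethernet address" lines in both dmesg listings appear to be derived from each controller's EUI64. The sketch below reproduces the pattern that is visible in the output (first byte with the locally administered bit set, bytes 1 and 2 copied, then the last three bytes of the EUI64). It is inferred from these two hosts, so treat it as an illustration rather than the driver's definitive logic; `eui64_to_fake_mac` is a hypothetical name.

```c
#include <stdint.h>

/*
 * Derive a 6-byte Ethernet address from an 8-byte EUI-64, following
 * the pattern seen in the dmesg output of orr and janney.
 */
static void eui64_to_fake_mac(const uint8_t eui[8], uint8_t mac[6])
{
    /* Set the "locally administered" bit and clear the multicast bit. */
    mac[0] = (uint8_t)((eui[0] | 0x02) & 0xfe);
    mac[1] = eui[1];
    mac[2] = eui[2];
    /* Skip EUI-64 bytes 3 and 4 and keep the low three bytes. */
    mac[3] = eui[5];
    mac[4] = eui[6];
    mac[5] = eui[7];
}
```

For orr, EUI64 08:00:28:56:02:00:49:8a gives 0a:00:28:00:49:8a, and for janney 00:50:42:b5:c0:0d:0d:af gives 02:50:42:0d:0d:af, matching the addresses reported above.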

Here is the fwe interface that is created when the FireWire adapter is active on orr.


fwe0: flags=108802 mtu 1500
       options=8
       ether 0a:00:28:00:49:8a
       ch 1 dma -1

Here is the fwe interface that is created when the FireWire adapter is active on janney.


fwe0: flags=108943 mtu 1500
       options=8
       inet6 fe80::50:42ff:fe0d:daf%fwe0 prefixlen 64 scopeid 0x2
       inet 10.1.1.2 netmask 0xffffff00 broadcast 10.1.1.255
       ether 02:50:42:0d:0d:af
       ch 1 dma 0

When I plug a FireWire cable into each host, dmesg on orr sees the following.


fwohci0: BUS reset
fwohci0: node_id=0xc800ffc1, gen=2, CYCLEMASTER mode
firewire0: 2 nodes, maxhop <= 1, cable IRM = 1 (me)
firewire0: bus manager 1 (me)
firewire0: New S400 device ID:005042b5c00d0daf

Server janney sees similar messages.


fwohci0: BUS reset
fwohci0: node_id=0x8800ffc0, gen=4, non CYCLEMASTER mode
firewire0: 2 nodes, maxhop <= 1, cable IRM = 1
firewire0: bus manager 1
firewire0: New S400 device ID:080028560200498a

First I assign an IP to fwe0 on orr.


orr:/home/richard$ sudo ifconfig fwe0 inet 10.1.1.1 netmask 255.255.255.0 up
orr:/home/richard$ ifconfig fwe0
fwe0: flags=108943 mtu 1500
       options=8
       inet6 fe80::800:28ff:fe00:498a%fwe0 prefixlen 64 scopeid 0x4
       inet 10.1.1.1 netmask 0xffffff00 broadcast 10.1.1.255
       ether 0a:00:28:00:49:8a
       ch 1 dma 0

Next I assign an IP to fwe0 on janney.


janney:/home/richard$ sudo ifconfig fwe0 inet 10.1.1.2 netmask 255.255.255.0 up
janney:/home/richard$ ifconfig fwe0
fwe0: flags=108943 mtu 1500
       options=8
       inet6 fe80::50:42ff:fe0d:daf%fwe0 prefixlen 64 scopeid 0x2
       inet 10.1.1.2 netmask 0xffffff00 broadcast 10.1.1.255
       ether 02:50:42:0d:0d:af
       ch 1 dma 0

Now the two systems can communicate over FireWire.


janney:/home/richard$ ping 10.1.1.1
PING 10.1.1.1 (10.1.1.1): 56 data bytes
64 bytes from 10.1.1.1: icmp_seq=0 ttl=64 time=0.694 ms
64 bytes from 10.1.1.1: icmp_seq=1 ttl=64 time=0.333 ms
64 bytes from 10.1.1.1: icmp_seq=2 ttl=64 time=0.312 ms
^C
--- 10.1.1.1 ping statistics ---
3 packets transmitted, 3 packets received, 0% packet loss
round-trip min/avg/max/stddev = 0.312/0.446/0.694/0.175 ms

One can sniff the fwe0 interface with Tcpdump.


07:36:42.479420 IP6 fe80::50:42ff:fe0d:daf > ff02::1:ff0d:daf: HBH ICMP6, multicast listener reportmax resp delay: 0 addr: ff02::1:ff0d:daf, length 24
07:36:42.479667 IP6 fe80::50:42ff:fe0d:daf > ff02::2:f2f0:bec6: HBH ICMP6, multicast listener reportmax resp delay: 0 addr: ff02::2:f2f0:bec6, length 24
07:36:42.479818 arp who-has 10.1.1.2 tell 10.1.1.2
07:36:42.498496 IP6 :: > ff02::1:ff0d:daf: ICMP6, neighbor solicitation, who has fe80::50:42ff:fe0d:daf, length 24
07:36:43.756817 IP6 fe80::50:42ff:fe0d:daf > ff02::1:ff0d:daf: HBH ICMP6, multicast listener reportmax resp delay: 0 addr: ff02::1:ff0d:daf, length 24
07:36:50.962703 IP6 fe80::50:42ff:fe0d:daf > ff02::2:f2f0:bec6: HBH ICMP6, multicast listener reportmax resp delay: 0 addr: ff02::2:f2f0:bec6, length 24
07:36:54.422718 arp who-has 10.1.1.1 tell 10.1.1.2
07:36:54.422793 arp reply 10.1.1.1 is-at 0a:00:28:00:49:8a
07:36:54.423012 IP 10.1.1.2 > 10.1.1.1: ICMP echo request, id 34306, seq 0, length 64
07:36:54.423050 IP 10.1.1.1 > 10.1.1.2: ICMP echo reply, id 34306, seq 0, length 64

You can see that the fwe driver supports IPv6. The last four packets are IPv4.

Next I decided to try the fwip driver. I loaded the fwip kernel module on each system. First, orr.


orr:/home/richard$ sudo kldload fwip
orr:/home/richard$ kldstat
Id Refs Address    Size     Name
1   12 0xc0400000 63072c   kernel
2    2 0xc0a31000 74b0     snd_csa.ko
3    3 0xc0a39000 1d408    sound.ko
4    1 0xc0a57000 c3a4     r128.ko
5    2 0xc0a64000 eeec     drm.ko
6   16 0xc0a73000 568dc    acpi.ko
7    1 0xc2236000 5000     if_fwip.ko

This created the fwip0 interface, which I gave an IP address.


orr:/home/richard$ ifconfig fwip0
fwip0: flags=108802 mtu 1500
       lladdr 8.0.28.56.2.0.49.8a.a.2.ff.fe.0.0.0.0
orr:/home/richard$ sudo ifconfig fwip0 inet 172.16.1.1 netmask 255.255.255.0 up
orr:/home/richard$ ifconfig fwip0
fwip0: flags=108843 mtu 1500
       inet6 fe80::a00:2856:200:498a%fwip0 prefixlen 64 scopeid 0x5
       inet 172.16.1.1 netmask 0xffffff00 broadcast 172.16.1.255
       lladdr 8.0.28.56.2.0.49.8a.a.2.ff.fe.0.0.0.0

Notice dmesg output for orr.


fwip0:  on firewire0
fwip0: Firewire address: 08:00:28:56:02:00:49:8a @ 0xfffe00000000, S400, maxrec 2048

Now, janney.


janney:/home/richard$ sudo kldload fwip
janney:/home/richard$ kldstat
Id Refs Address    Size     Name
1    5 0xc0400000 641298   kernel
2   16 0xc0a42000 568dc    acpi.ko
3    1 0xc214a000 5000     if_fwip.ko
janney:/home/richard$ ifconfig fwip0
fwip0: flags=108802 mtu 1500
       lladdr 0.50.42.b5.c0.d.d.af.a.2.ff.fe.0.0.0.0
janney:/home/richard$ sudo ifconfig fwip0 inet 172.16.1.2 netmask 255.255.255.0 up
janney:/home/richard$ ifconfig fwip0
fwip0: flags=108843 mtu 1500
       inet6 fe80::250:42b5:c00d:daf%fwip0 prefixlen 64 scopeid 0x6
       inet 172.16.1.2 netmask 0xffffff00 broadcast 172.16.1.255
       lladdr 0.50.42.b5.c0.d.d.af.a.2.ff.fe.0.0.0.0

Notice janney's dmesg.


fwip0:  on firewire0
fwip0: Firewire address: 00:50:42:b5:c0:0d:0d:af @ 0xfffe00000000, S400, maxrec 2048

Now the two systems can communicate using the fwip driver.


janney:/home/richard$ ping -c 1 172.16.1.1
PING 172.16.1.1 (172.16.1.1): 56 data bytes
64 bytes from 172.16.1.1: icmp_seq=0 ttl=64 time=0.733 ms

--- 172.16.1.1 ping statistics ---
1 packets transmitted, 1 packets received, 0% packet loss
round-trip min/avg/max/stddev = 0.733/0.733/0.733/0.000 ms

You can also sniff fwip0. Take a look at the differences.


orr:/home/richard$ tcpdump -n -r fwe0.short.lpc -c 2 icmp
reading from file fwe0.short.lpc, link-type EN10MB (Ethernet)
07:36:54.423012 IP 10.1.1.2 > 10.1.1.1: ICMP echo request, id 34306, seq 0, length 64
07:36:54.423050 IP 10.1.1.1 > 10.1.1.2: ICMP echo reply, id 34306, seq 0, length 64

orr:/home/richard$ tcpdump -n -r fwip0.short.lpc -c 2 icmp
reading from file fwip0.short.lpc, link-type APPLE_IP_OVER_IEEE1394 (Apple IP-over-IEEE 1394)
07:41:52.593940 IP 172.16.1.2 > 172.16.1.1: ICMP echo request, id 38914, seq 0, length 64
07:41:52.593970 IP 172.16.1.1 > 172.16.1.2: ICMP echo reply, id 38914, seq 0, length 64

The first reports "Ethernet" because fwe is an Ethernet emulation layer. The second reports Apple IP-over-IEEE 1394 because that is an entirely new protocol. Check it out in Tethereal.


Frame 8 (102 bytes on wire, 102 bytes captured)
   Arrival Time: Jan 30, 2006 07:41:52.593940000
   Time delta from previous packet: 0.000231000 seconds
   Time since reference or first frame: 13.327087000 seconds
   Frame Number: 8
   Packet Length: 102 bytes
   Capture Length: 102 bytes
   Protocols in frame: ap1394:ip:icmp:data
Apple IP-over-IEEE 1394, Src: 005042B5C00D0DAF, Dst: 80B2FFC100000000
   Destination: 80B2FFC100000000
   Source: 005042B5C00D0DAF
   Type: IP (0x0800)
Internet Protocol, Src: 172.16.1.2 (172.16.1.2), Dst: 172.16.1.1 (172.16.1.1)
   Version: 4
   Header length: 20 bytes
   Differentiated Services Field: 0x00 (DSCP 0x00: Default; ECN: 0x00)
       0000 00.. = Differentiated Services Codepoint: Default (0x00)
       .... ..0. = ECN-Capable Transport (ECT): 0
       .... ...0 = ECN-CE: 0
   Total Length: 84
   Identification: 0x0696 (1686)
   Flags: 0x00
       0... = Reserved bit: Not set
       .0.. = Don't fragment: Not set
       ..0. = More fragments: Not set
   Fragment offset: 0
   Time to live: 64
   Protocol: ICMP (0x01)
   Header checksum: 0x19f0 [correct]
       Good: True
       Bad : False
   Source: 172.16.1.2 (172.16.1.2)
   Destination: 172.16.1.1 (172.16.1.1)
Internet Control Message Protocol
   Type: 8 (Echo (ping) request)
   Code: 0
   Checksum: 0xe88d [correct]
   Identifier: 0x9802
   Sequence number: 0x0000
   Data (56 bytes)

0000  43 de 09 a3 00 09 3e e2 08 09 0a 0b 0c 0d 0e 0f   C.....>.........
0010  10 11 12 13 14 15 16 17 18 19 1a 1b 1c 1d 1e 1f   ................
0020  20 21 22 23 24 25 26 27 28 29 2a 2b 2c 2d 2e 2f    !"#$%&'()*+,-./
0030  30 31 32 33 34 35 36 37                           01234567

I noticed each host reported error messages like the following. I believe these were caused by the fwip driver.


fwohci0: txd err= 3 miss Ack err

I believe fwip has a future, since it implements a standard.

For a rough idea of how fast these interfaces were, I transferred the same 100 MB file using the fwe, fwip, and fxp/xl wired Ethernet interfaces. Here are the results.

fwe: 104857600 bytes received in 00:09 (10.16 MB/s)

fwip: 104857600 bytes received in 00:09 (11.01 MB/s)

fxp/xl through a switch: 104857600 bytes received in 00:08 (11.21 MB/s)

fxp/xl via crossover cable: 104857600 bytes received in 00:09 (10.75 MB/s)

All the results are roughly the same, so the bottleneck is probably the hard drive of the laptop.
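
To put the four figures on a common scale, the rates can be converted to megabits per second. This is only a back-of-the-envelope sketch: it assumes the FTP client's "MB" means 2^20 bytes, which the output itself does not state, and the helper names are hypothetical.

```c
/* Bytes per "MB", ASSUMING the client reports MiB (2^20 bytes). */
#define MIB (1024.0 * 1024.0)

/* Elapsed time in seconds implied by a byte count and a reported rate. */
static double implied_seconds(double bytes, double mib_per_s)
{
    return bytes / MIB / mib_per_s;
}

/* Convert a reported rate to megabits per second (1 Mbit = 10^6 bits). */
static double mbit_per_s(double mib_per_s)
{
    return mib_per_s * MIB * 8.0 / 1e6;
}
```

Under that assumption, 11.21 MB/s is about 94 Mbit/s and 10.16 MB/s is about 85 Mbit/s, and the 104857600-byte transfer at 10.16 MB/s took roughly 9.8 seconds.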

This opens some interesting possibilities for networking. Maybe I will buy a FireWire hub and connect multiple machines simultaneously.

I hope to try some of the debugging and memory access features of FireWire in the future.

posted by Richard Bejtlich # 07:55

Comments:

To get some idea of peak TCP throughput, use iperf which is in ports.

ISTR tests on a mac using firewire which topped out at around 200 meg/sec.

Greg

# posted by Anonymous : 9:16 AM

Interesting, but I'm struggling with why you'd want to do this? Performance is similar to Ethernet, so why not just use the far more ubiquitous Ethernet?

# posted by Alastair : 7:07 PM

I tried this technology because "it was there," like Everest. I'd like to know if anyone has any production uses?

# posted by Richard Bejtlich : 7:27 PM

I actually used it a few weeks ago, to install FreeBSD over the net on a machine where the onboard NIC wasn't supported by the installer, but the firewire port was. I set up bridging between default NIC and fwe0 on a helper machine, which allowed the new machine to install over the firewire port.

# posted by oyvindmo : 7:53 AM

Oyvindmo -- thanks, that's very cool.

# posted by Richard Bejtlich : 8:33 AM

I have already used FireWire as the second NIC on my laptop when running FreeBSD; it is always nice to have a choice.

Bearing in mind that FireWire is still under GIANT, it would be interesting to know the performance numbers you would get testing just the network component, as Greg suggested.

# posted by Joao Barros : 10:44 AM

2006-02-09

nvidia onboard ethernet

Originally Posted by Jmdbh

The nve driver in FreeBSD 6.0-RELEASE suffers from this problem. I own a board with an nForce4 chipset and onboard Nvidia Ethernet myself.

The existing issues are fixed in 6-STABLE. However, it is difficult to update the system if you are not connected to a network. You can check out the current version from cvsweb.freebsd.org on another computer and copy it over. You need to update the files in src/sys/dev/nve/* and src/sys/contrib/dev/nve, then rebuild your kernel.

2006-02-08

How to use networking in QEMU via TUN/TAP

After spending perhaps the most frustrating evening of my life attempting to make a Windows 98 guest in qemu talk to the world via TUN in Linux, I thought I'd share my observations with the list and gather some feedback. Firstly, I could not find ~anywhere~ a decent high-level, conceptual outline of what a TUN network 'looks like' in Linux; and that seems to lead to a lot of confusion, both for me personally, and on this list generally (I came across many, many incorrect qemu-ifup scripts etc that just Don't Work on various threads archived on this list and elsewhere). Now that I have made TUN bend to my will, I thought I'd clear up some confusion and draw some pictures, for the future benefit of anyone else like me trying to untangle this mess with the help of google ;)

First up; what does a TUN client network 'look like'? Here's how I visualise it:

real NIC
   |
   v
------|  <-192.168.0.10 (real NIC IP)
|    |--------------------------------->THE WORLD
|____|
   ^  <-192.168.0.10 (Pretend real NIC IP)
   |
   |  <-Pretend network link
   |
   v  <-10.0.0.1 (Pretend fake tun0 IP)
------|
|    |  <-Fake 'tun0' NIC
|____|
   ^  <- 10.0.0.1 (tun0 IP)
   |
   |  <-Pretend connection to guest OS
   |
   v  <- 10.0.0.x (ie 10.0.0.10) QEMU guest NIC IP
------|
|    |  <-QEMU guest OS NIC
|____|

In other words, conceptually, tun0 actually sits 'between' the host OS and the guest (whether this is actually the case or not I leave as an intellectual exercise for the reader). So, the tun0 address and the qemu guest OS address should be different, but should both be on the same subnet, and that subnet should be utterly different to anything else on your network. Intellectually I can appreciate the possibility that it shouldn't *have* to be on a different subnet, but in practice I met with utter and complete failure attempting to make tun0 and the guest OS NIC live on the same subnet as all the 'real' NICs.
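
The addressing rule of thumb in the previous paragraph (tun0 and the guest differ, share a subnet, and that subnet is not the host's) can be checked mechanically. The following sketch does so with `inet_pton`; the function names are hypothetical, and the netmask is an assumption you must supply, since the text does not state one.

```c
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* 1 if a and b share a subnet under mask, 0 if not, -1 on a parse error. */
static int same_subnet(const char *a, const char *b, const char *mask)
{
    struct in_addr ia, ib, im;

    if (inet_pton(AF_INET, a, &ia) != 1 ||
        inet_pton(AF_INET, b, &ib) != 1 ||
        inet_pton(AF_INET, mask, &im) != 1)
        return -1;
    return (ia.s_addr & im.s_addr) == (ib.s_addr & im.s_addr);
}

/*
 * 1 if the plan follows the rule of thumb: tun0 and guest addresses
 * differ, share a subnet, and that subnet is not the host NIC's.
 * 0 if it does not, -1 if any address fails to parse.
 */
static int tun_plan_ok(const char *host, const char *tun,
                       const char *guest, const char *mask)
{
    struct in_addr it, ig;
    int tg = same_subnet(tun, guest, mask);
    int th = same_subnet(tun, host, mask);

    if (tg < 0 || th < 0)
        return -1;
    inet_pton(AF_INET, tun, &it);   /* already validated above */
    inet_pton(AF_INET, guest, &ig);
    return tg == 1 && th == 0 && it.s_addr != ig.s_addr;
}
```

With the addresses from the diagram and an assumed mask of 255.255.255.0, `tun_plan_ok("192.168.0.10", "10.0.0.1", "10.0.0.10", "255.255.255.0")` accepts the plan, while putting tun0 and the guest on 192.168.0.x is rejected.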

Therefore, all traffic to the 'real world' goes from the guest, to tun0 (ie, in the guest OS set the address of the default gateway to be the tun0 address), and from tun0 out to the world via the real NIC address. To make all this work, you need a script that puts the house in order for tun0.

Your real NIC, because all of the traffic is routing through it, has to have an appropriate iptables rule to masquerade traffic, and ip forwarding needs to be enabled for the host OS. Here is a simple script that does everything necessary (ie, this is a working, complete qemu-ifup):

--------8<--------snip
#!/bin/bash
iptables --flush  #Clear out all previous rules ('/etc/init.d/iptables stop' may also work)
echo 1 >/proc/sys/net/ipv4/ip_forward  #Enable IP forwarding for the host OS
iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE  #Enable masquerading on your real NIC so tun0 can get in and out
ifconfig $1 10.0.0.1  #Bring up tun0 on a different subnet from the host
--------8<--------snip

This should work as far as tun0 goes, but it wipes out all existing iptables
rules, which is fine if you are not otherwise using iptables. To preserve my
other iptables rules, I added the masquerade rule to my saved rules via:

From terminal as root
#iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE
#/etc/rc.d/init.d/iptables save

And eliminated the  "iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE "
command from my qemu-ifup script. Of course iptables must be set to start on
boot.

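An excellent open source emulator: QEMU

The Accelerator just released for QEMU has revolutionized it; QEMU is no longer a low-performance emulator.
With the Accelerator, QEMU is suddenly as much as five times faster. OSNews reports:
This means you could theoretically run Windows (or another OS) on
a Linux machine at near native speeds without buying a commercial emulator.
I upgraded today and it really is impressive. I installed RHEL4 and Win2k and the speed is very good. The disk-check errors that earlier versions produced
when an image larger than 2 GB was allocated are gone too, so running dangerous experiments in an emulated Linux is much more pleasant. :)
QEMU is also easier to install than win4lin/vmware, and above all QEMU is open source, whereas the other two require a serial number :(
The Accelerator is a proprietary product, but it is free to use, with some restrictions on redistribution.
All in all it is a nice piece of software. Give it a try!
Don't know how to install or use it? Read my installation and usage notes ^_^
Note: this method applies only to 2.6.x kernels; for 2.4 kernels see the QEMU installation documentation.
QEMU has two emulation modes: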

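1. Full system emulation: in this mode QEMU emulates a complete system (for example, a whole PC), including the CPU and the peripheral devices. It lets you run different operating systems side by side without rebooting, or debug system code.

2. User mode emulation (Linux hosts only): in this mode QEMU can run Linux processes compiled for one CPU on another CPU. This is typically used to run the Wine emulator or to ease cross-compilation and cross-debugging.

Here I only cover emulating a whole PC :)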

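First download, compile, and install:
At present only the CVS version of QEMU can use the Accelerator, so it has to be compiled. For other versions of QEMU you can download a binary and simply unpack it under /, which is very convenient. Emulators such as win4lin and vmware, by contrast, require kernel patches, are not open source, and are more troublesome to install. (At least I never managed to install either of them.)
The binaries cannot use the accelerator yet, though, so they are much slower. I expect the binary releases to include the accelerator soon.
After all, the accelerator has only been out for a few days. Something to look forward to... ^_^
OK, back to the point. Let's compile and install the CVS version of QEMU with Accelerator support:
First download the current CVS snapshot of qemu from http://www.dad-answers.com/qemu/
My version is: qemu-snapshot-2005-02-22_23.tar.bz2
Then download the accelerator from http://fabrice.bellard.free.fr/qemu/kqemu-0.6.2-1.tar.gz
Note: compiling kqemu requires the kernel source. Unpack a source package that exactly matches the running kernel version under /usr/src
and make sure that /lib/modules/`uname -r`/build points to the kernel source directory, as shown here:
[root@LFS ~]#ls -l /lib/modules/2.6.10-lvm/build
lrwxrwxrwx 1 root root 21 Feb 22 12:50 /lib/modules/2.6.10-lvm/build -> /usr/src/linux-2.6.10/
If it does not point to the kernel source directory, create the link with ln -s:
[root@LFS ~]#ln -s /usr/src/linux-2.6.10 /lib/modules/2.6.10-lvm/build
Generate the kernel source files that kqemu needs:
[root@LFS ~]#cd /usr/src/linux-2.6.10
/root ------------> /usr/src/linux-2.6.10
[root@LFS linux-2.6.10]#make mrproper
This makes sure the kernel source tree is clean, so that the compiled kqemu is usable.
[root@LFS linux-2.6.10]#cp /boot/config-2.6.10 .config
Copy over the configuration file of the running kernel.
[root@LFS linux-2.6.10]#make scripts/
This generates what kqemu needs; without this step the build fails.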

Once that is done you can build QEMU. Compile and install it with the following commands:
[root@LFS ~]#tar jxvf qemu-snapshot-2005-02-22_23.tar.bz2
[root@LFS ~]#tar zxvf kqemu-0.6.2-1.tar.gz -C qemu-snapshot-2005-02-22_23/
The -C qemu-snapshot-2005-02-22_23/ option unpacks kqemu into the qemu directory so that qemu is built with kqemu support.
[root@LFS ~]#cd qemu-snapshot-2005-02-22_23
/root ------------> /root/qemu-snapshot-2005-02-22_23
[root@LFS qemu-snapshot-2005-02-22_23]#./configure
........... (omitted)
KQEMU module configuration: ---------> indicates that kqemu support is included
kernel sources /lib/modules/2.6.10-lvm/build
kbuild type 2.6
[root@LFS qemu-snapshot-2005-02-22_23]#make
[root@LFS qemu-snapshot-2005-02-22_23]#make install
This installs qemu under /usr/local, with all executables in /usr/local/bin. To install under /usr instead, run configure with a prefix:
[root@LFS qemu-snapshot-2005-02-22_23]#./configure --prefix=/usr

After installation the kqemu module is located at: /lib/modules/2.6.10-lvm/misc/kqemu.ko
The install script also creates a kqemu device under /dev/ automatically:
[root@LFS linux-2.6.10]#ls -l /dev/kqemu
crw-rw-rw- 1 root root 250, 0 Feb 24 2005 /dev/kqemu
[root@LFS linux-2.6.10]#
Load the kqemu module:
[root@LFS linux-2.6.10]#modprobe kqemu
[root@LFS linux-2.6.10]#
Check with lsmod:
[root@LFS linux-2.6.10]#lsmod |grep kqemu
kqemu 41864 0
[root@LFS linux-2.6.10]#

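If you find that /dev/kqemu has disappeared after a reboot, you need to create it again:
[root@LFS linux-2.6.10]#mknod /dev/kqemu c 250 0
[root@LFS linux-2.6.10]#chmod 666 /dev/kqemu
You can add the two commands above to a system startup script such as /etc/rc.d/rc.local.
Add the modprobe kqemu command there as well (if you want the kqemu module loaded automatically at every boot).
Note that commands added to /etc/rc.d/rc.local are best written with absolute paths, for example /sbin/modprobe.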
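
Before re-running mknod at every boot, it can be useful to check whether the device node is already there with the expected numbers. The following is a minimal sketch for Linux with glibc (it relies on `<sys/sysmacros.h>` for `major()` and `minor()`); `is_char_dev` is a hypothetical helper, and 250, 0 are simply the numbers shown in the listing above.

```c
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/sysmacros.h> /* major(), minor() on Linux/glibc */

/*
 * Return 1 if `path` exists and is a character device with the given
 * major and minor numbers, 0 otherwise (including when it is missing).
 */
static int is_char_dev(const char *path, unsigned maj, unsigned min)
{
    struct stat st;

    if (stat(path, &st) != 0)
        return 0;               /* missing or not accessible */
    if (!S_ISCHR(st.st_mode))
        return 0;               /* exists, but is not a character device */
    return major(st.st_rdev) == maj && minor(st.st_rdev) == min;
}
```

A startup helper could call `is_char_dev("/dev/kqemu", 250, 0)` and only recreate the node when the result is 0.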

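OK, QEMU and the Accelerator are now both installed, so we can start installing an OS.
I will use the installation of RHEL4 as an example to show basic qemu usage. It is very simple!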

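First create a disk image file:
[root@LFS distro]#qemu-img create redhat.img 6G
Formating 'redhat.img', fmt=raw, size=6291456 kB
[root@LFS distro]#
This creates a 6 GB disk image named redhat.img.
Note: the disk image should be smaller than the free space actually left on your partition.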

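Start installing RHEL4:
[root@LFS ~]#qemu -boot d -cdrom /rhel4/EL_disc1.iso -hda redhat.img -enable-audio
-boot d : boot from the CD-ROM. a (floppy), c (hard disk), d (CD-ROM)
-cdrom : an ISO file; you can also use the CD-ROM device directly (/dev/cdrom)... don't forget to insert the disc :)
-hda : the virtual machine's hard disk, i.e. the image just created with qemu-img.
-enable-audio : sound card support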

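When the installer asks you to change discs:
In qemu press ctrl+alt+2 to switch to the qemu monitor; type ? or help to see the available commands and how to use them.
(In other versions of qemu, the shell qemu was started from automatically becomes the qemu monitor once the OS is loaded.)
change device filename -- change a removable media
So this is the command for changing discs: change cdrom /rhel4/EL_disc2.iso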

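A few other commonly used monitor commands:
savevm filename : save the entire current state of the virtual machine
loadvm filename : restore it (before I knew about change, I swapped discs with savevm -> restart qemu -> loadvm :( )
sendkey keys : send keystrokes to the VM. For example, if you press ctrl-alt-F2 to switch to another terminal inside the virtual machine,
unfortunately it is your host system that switches, so you need sendkey instead: sendkey ctrl-alt-f2
There are a few more commands; have a look for yourself.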

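After a very long wait the installation finally finished. Now try booting it:
[root@LFS distro]#qemu redhat.img -enable-audio -user-net -m 64
-user-net : similar to VMware's NAT; if the host can get online, so can the virtual machine
-m 64 : use 64 MB of memory; the default is 128 MB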

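ctrl-alt-f : full screen
ctrl-alt : switch the mouse between the host and the virtual machine
qemu has some other options as well; run qemu with no arguments to see their descriptions.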

Good Luck ! ^_^

Related resources:
http://fabrice.bellard.free.fr/qemu/
qemu home page: download, doc, faq, etc.
http://www.dad-answers.com/qemu/
QEMU CVS snapshots and some useful supporting tools for QEMU
http://www.dad-answers.com/qemu-forum/
qemu forum

2006-02-06

Installing VMware on FreeBSD 5.1: the complete process

Author: UNIX中文寶庫  Posted: 2005.04.25

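This article describes the complete process of installing VMware on FreeBSD 5.1-RELEASE. The FreeBSD 5.1 ports collection supports VMware 3.2.1-2237, but the latest version is already 3.2.1-2242, so installing from the ports collection runs into some problems. This article works through them.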

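I also tried installing VMware 4 on FreeBSD 5.1, but it failed because FreeBSD 5.1's Linux compatibility mode has no emulation of lsmod under /compat/linux/sbin. That attempt will have to wait for a later FreeBSD release or an update to the ports collection. Of course, by the time you read this (it is now 9:30 PM 7/30/2003) all of this may be out of date, but it can at least show you one way of approaching the problem.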

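First, download the latest version of VMware 3 from the VMware web site. At the time of writing the latest version is 3.2.1-2242, and the downloaded file is named VMware-workstation-3.2.1-2242.tar.gz.

Then download the two files vmmon-only-3.2.1-20030514.tar.gz and vmnet-only-3.2.1-20030412.tar.gz from http://people.freebsd.org/~mbr/vmware.

Put these three files in /usr/ports/distfiles.

Before starting, make sure that FreeBSD 5.1's Linux compatibility mode is installed and enabled in rc.conf. To check, run kldstat; if linux.ko appears in the output, compatibility mode is installed. If it does not appear, install it like this: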

#cd /usr/ports/emulators/linux_base8
#make install clean

After the installation, check that rc.conf contains linux_enable="YES", then reboot; kldstat should now show linux.ko. You will also find the Linux bin, usr, sbin, mnt and other directories under /compat/linux.

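With everything ready, the first step is to edit the Makefile and distinfo in /usr/ports/emulators/vmware3 so that they match build 2242 of VMware.

First back up the original Makefile and distinfo as Makefile.2237 and distinfo.2237. Then follow these steps:

1. Edit the Makefile and change 3.2.1-2237 to 3.2.1-2242 (there is only one place to change).

2. Run # md5 VMware-workstation-3.2.1-2242.tar.gz to get the file's MD5 checksum, and note it down.

3. Edit distinfo: change 3.2.1-2237 to 3.2.1-2242 and replace the MD5 value for that entry with the one just obtained. This must be exactly right, otherwise the installation will not start.

4. Run # make install in /usr/ports/emulators/vmware3 to start the installation.

5. After a lot of output scrolls by, a blue screen asks whether you want bridged networking. I personally find bridged networking easier to use than routed networking, so I chose yes and then entered the network device name, for example pcn0, ln0, dc0 or fxp0.

6. The installation then continues, with more output scrolling by, and finally returns to the prompt. Now you can check whether the virtual NIC was installed: run # /usr/local/etc/rc.d/vmware.sh start and then # ifconfig -a. If you see a device named vmnet1, congratulations, it worked!

7. Because Linux compatibility mode is used, add a line to /etc/fstab:

/linproc /compat/linux/proc linprocfs rw 0 0

8. Configure the VMware virtual NIC in rc.conf, then reboot.

9. After the reboot, copy config from /usr/local/etc/vmware to /root/.vmware, then edit the file and add the line webbrowser="mozilla %s".

10. Copy the licenses directory from /usr/local/lib/vmware/lib to /usr/lib/vmware (/usr/lib/vmware does not exist by default; you need to create it yourself).

11. Run /usr/local/bin/vmware, enter the serial number under Help, and start using it! You have now successfully run VMware 3.2.1-2242 on FreeBSD. Congratulations!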

The serial number is entered in this form:

Serial = "6818X-84WD1-01KDK-3JN9X"
Name = "wasily"
CompanyName = "mcn"

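Once you start using it you will run into further problems, with the mouse, the NIC and so on, and you will have to investigate them yourself. Here are some tricks I found for the problems you are likely to hit:

1. Mouse

If you want to install Windows in VMware, a mouse is essential. With VMware's default mouse configuration it most likely will not work, so I suggest adjusting it yourself. Almost everyone uses a PS/2 mouse these days; change the mouse setting in the virtual machine from sysmouse to ps/2 mouse and it works normally.

2. Sound card

VMware 3 emulates the sound card very poorly. If you want sound, it is best to wait for a later version and not waste effort here. Even Windows Server 2003 does not seem to ship a driver for the virtual sound card in VMware 3.

3. NIC

During installation, the virtual NIC we saw with ifconfig -a was vmnet1, but the default configuration generated by the wizard uses the device name vmnet0. So before powering on, change it: in the VMware 3 interface open Settings, Configuration Editor, set the NIC to Custom, and enter /dev/vmnet1 as the device name.

4. Missing files?

If VMware complains about missing files while you are using it, I suggest simply copying every directory under /usr/local/lib/vmware/lib to /usr/lib/vmware!

That is my experience so far; I hope it helps. I ended up doing this partly out of necessity and partly out of boredom. My personal favorites are the three BSD brothers, but at work I am constantly asked to write .net, com+ and the like, so this is how I get to play with them, haha!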

http://network.ccidnet.com/art/215/20050425/243023_1.html