MSHEAR Binary Patch 0531-01

R.W. Busby 26 Sep 99

Caltech/USGS TriNet Seismic Network

 

An archive file, msbp990926.slip.lzh, is available to correct an error in SLIP communication links that can render a link inoperable. This patch applies to MSHEAR 36/09-0531 and is also backward compatible with all previous versions of MSHEAR. It is the first patch of this release.

Description of the problem

The datalogger enters a condition whereby all SLIP communication is disabled. The system sends IP packets out the serial port but no packets are accepted in return. The serial port of the datalogger is not polled for input and hence no incoming slip packets are received. The ifslip process transmitting packets monopolizes the serial port and does not allow the ifslip process receiving packets access.

 

In our case, the trigger for entering the problem state was the onset of acknowledgments on the SEC comlink during a period when the PRI comlink was still retransmitting frequently but receiving no acknowledgments. An error in the configuration parameters exacerbated the situation by setting the comlink resend packet window size to six packets instead of two, in combination with a very short resend timeout of 2 seconds (see Configuration Details below). In lab simulations, we used ws=16, resendpkts=6 and resend=10 for normal operations and could trigger the problem state by changing to resend=2 via option K of the aqshell menu. In our case, killing the dacommo process could reactivate the link.
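As a rough illustration of why the shorter timeout matters, the sketch below computes an upper bound on retransmitted packets per second for a single comlink. It assumes, which this note does not confirm, that a link resends at most resendpkts packets each time the resend timeout expires. The function name max_resend_rate is hypothetical and is not part of MSHEAR.

```python
def max_resend_rate(resendpkts: int, resend_seconds: float) -> float:
    """Upper bound on retransmitted packets per second for one comlink.

    Assumption (not confirmed by the patch note): at most `resendpkts`
    packets are resent every `resend_seconds` seconds.
    """
    if resend_seconds <= 0:
        raise ValueError("resend timeout must be positive")
    return resendpkts / resend_seconds


# Parameter values quoted in the text above.
lab_normal = max_resend_rate(resendpkts=6, resend_seconds=10)
lab_problem = max_resend_rate(resendpkts=6, resend_seconds=2)
window_of_two = max_resend_rate(resendpkts=2, resend_seconds=2)

print(f"lab normal  (resendpkts=6, resend=10): {lab_normal} pkt/s")
print(f"lab problem (resendpkts=6, resend=2):  {lab_problem} pkt/s")
print(f"window of 2 (resendpkts=2, resend=2):  {window_of_two} pkt/s")
```

Under that assumption, shortening the timeout from 10 to 2 seconds raises the bound fivefold for each link that is not being acknowledged.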

 

The flooding option of a comlink does not cause the slip driver to hang. A gross overload of packets appears to make the driver discard packets rather than hang. Only under certain conditions does the driver appear to keep up transmission at the expense of reception.

 

The problem is clearly indicated by the slipstat reports. Viewing this status at 5-minute intervals showed that the count of outbound packets increased while the count of inbound packets did not. The number of polls of the serial port also did not increase. See the example output in Diagnostic Details below.
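The check described above can be sketched as a small script run on a host that has captured two slipstat reports as text. This is only an illustration: the function names are hypothetical, the parsing assumes the field labels appear exactly as in the diagnostic output in this note, and the two short excerpts are copied from that output.

```python
import re


def parse_slipstat(report: str) -> dict:
    """Extract the inbound/outbound packet counts and the poll count."""
    pkts = re.search(r"IP Packets In/Out:\s+(\d+)\s+(\d+)", report)
    polls = re.search(r"GS_READY Polls:\s+(\d+)", report)
    if pkts is None or polls is None:
        raise ValueError("text does not look like slipstat output")
    return {
        "in": int(pkts.group(1)),
        "out": int(pkts.group(2)),
        "polls": int(polls.group(1)),
    }


def receive_stalled(earlier: dict, later: dict) -> bool:
    """True when output advances but input and serial polls do not."""
    return (
        later["out"] > earlier["out"]
        and later["in"] == earlier["in"]
        and later["polls"] == earlier["polls"]
    )


# Excerpts of the two reports shown in Diagnostic Details.
first = parse_slipstat(
    "IP Packets In/Out: 1823507 2844113\nGS_READY Polls: 9621394\n"
)
second = parse_slipstat(
    "IP Packets In/Out: 1823507 2845192\nGS_READY Polls: 9621394\n"
)

print("receive side stalled:", receive_stalled(first, second))
```

A link that is merely idle would show no growth in outbound packets either, which is why the test requires the outbound count to increase.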

 

Solution to the Problem

A revised slip driver (ifslip) works in combination with a revised serial port descriptor (5x8530) to correctly arbitrate for the serial port. The serial port driver is available for Q4120 and Q730. A version for Q680 systems is not yet available. The new slip driver will work with older versions of the serial port descriptor (including the Q680 version) to avoid the error, though better performance is achieved with the new combination. The revised files are offered as a binary lharc archive at: ftp://quake.geo.berkeley.edu/pub/quanterra/mshear/release/msbp990926.slip.lzh

This archive includes a copy of the old version and the new version of each module. Extracting it also updates the file actually used by the system to the new version.

 

To install the archive, transfer the binary file to the datalogger (to /h0/HOLDING). Extract the file by entering:

chd /h0

lharc -xf holding/msbp990926.slip.lzh

 

The versions of these modules released with MSHEAR 36/09-0531 were:

/h0/isp98/cmds/ifslip version 11 crc $2686D0 edition #21

/h0/overlays/4120/5x8530 version 23 crc $A391B3 edition #22

 

The new versions are:

/h0/isp98/cmds/ifslip version 12 crc $1D0880 edition #21

/h0/overlays/4120/5x8530 version 26 crc $DD5D03 edition #26

 

Use the CRC to uniquely identify the module. To determine which version is loaded in memory, enter:

sysop: ident -m ifslip

Header for: ifslip

Module size: $26EC #9964

Owner: 0.0

Module CRC: $1D0880 Good CRC <<==== look at this value

Header parity: $2564 Good parity

Edition: $15 #21

Ty/La At/Rev $E01 $A001

Permission: $555 -----e-r-e-r-e-r

Dev Drv, 68000 obj, Sharable, System State Process

 

For more details on the versions and the filesystem organization, see Module Version Details below.
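When several dataloggers need checking, the CRC comparison can be sketched as below, run on a host against captured ident output. The CRC values are the ones listed in this note for ifslip and for the 4120 overlay of 5x8530. The function name and the idea of capturing ident output as text are assumptions made for illustration.

```python
import re

# (module name, CRC) -> description, taken from the values in this note.
KNOWN_CRCS = {
    ("ifslip", "2686D0"): "version 11 (released with MSHEAR 36/09-0531)",
    ("ifslip", "1D0880"): "version 12 (patched)",
    ("5x8530", "A391B3"): "version 23 (released with MSHEAR 36/09-0531)",
    ("5x8530", "DD5D03"): "version 26 (patched)",
}


def classify_ident(output: str) -> str:
    """Map captured ident output to a known module version by CRC."""
    name = re.search(r"Header for:\s+(\S+)", output)
    crc = re.search(r"Module CRC:\s+\$([0-9A-Fa-f]+)\s+Good CRC", output)
    if name is None or crc is None:
        return "unrecognized ident output or bad CRC"
    key = (name.group(1), crc.group(1).upper())
    return KNOWN_CRCS.get(key, "unknown module version")


sample = "Header for: ifslip\nModule CRC: $1D0880 Good CRC\n"
print(classify_ident(sample))
```

The edition number is deliberately not used here, since both ifslip versions report edition #21.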

 

Additional Details Section

Diagnostic Details

VSP) date

August 19, 1999 Thursday 10:00:06 pm

 

VSP) slipstat /sl2

 

=======================================================================

 

IFSLIP Device Information Statistics:

-----------------------------------------------------------------------

Device = sl2 Driver = ifslip

MTU = 1006 bytes

Flags = 0x0132 [ BROADCAST PT_TO_PT NO-TRAILERS NO-ARP ]

 

if_this = 0x00e38190 if_next = 0x00eb5290 if_prev = 0x00e30010

if_static = 0x00e31990 if_size = 0x000000d0

 

Socket Address (Internet Style):

-----------------------------------------------------------------------

Address Family = 2 IP Port = 0 IP Address = 131.215.58.9

 

IFSLIP Driver Static Storage:

-----------------------------------------------------------------------

Input Output

--------------- ---------------

 

Serial Device: /x2 /x2

Process ID's: 19 20

Compression: OFF OFF

Mbuf Queue Head: 0x00000000 0x00e0ca10

Bytes In/Out: 68908970 1438618817

IP Packets In/Out: 1823507 2844113

Compressed Packets: 187430 0

Uncompressed Packets: 23299 0

Biggest IP Packet: 414 552

Smallest IP Packet: 3 40

Errors: 409832 0

Reopens: 0 0

System path: 20 21

Death Flag: 0 0

 

Mbuf Size: 4096

Failed InMbuf Alloc: 0

Runts: 6896

GS_READY Polls: 9621394

SS_SIG Waits: 9621394

 

IFSLIP Device Descriptor Options:

-----------------------------------------------------------------------

Serial Device - Input: /x2

Serial Device - Output: /x2

Process Priority - Input: 128

Process Priority - Output: 128

Receive Buffer Size: 4096

Compression: OFF

Parity-Stop Bits-Bits/Char: 0x00

Baud Rate Code: 0x0f

 

 

=======================================================================

 

VSP) ifcontrol /sl2

mbuf control module revision: 1

total mbuf size: 393216

total allocated: 15552

minimum reserve: 49152

failed attempts: 0

allocation mode: NO WAIT

looking for if control module ifi.83D73A09

if control module revision: 1

ip address: 83D73A09

if device name: sl2

total queued on xmit: 13700

xmit queue limit: 15000

discarded xmit bytes: 257912912

discarded xmit packets: 470644

total queued on recv: 0

discarded recv packets: 0

queued bytes in serial xmit buffer: 1024

total input packets: 1406780

total output packets: 2844169

 

A second report was obtained about six minutes later.

 

VSP) date

August 19, 1999 Thursday 10:06:12 pm

 

VSP) slipstat /sl2

 

=======================================================================

 

IFSLIP Device Information Statistics:

-----------------------------------------------------------------------

Device = sl2 Driver = ifslip

MTU = 1006 bytes

Flags = 0x0132 [ BROADCAST PT_TO_PT NO-TRAILERS NO-ARP ]

 

if_this = 0x00e38190 if_next = 0x00eb5290 if_prev = 0x00e30010

if_static = 0x00e31990 if_size = 0x000000d0

 

Socket Address (Internet Style):

-----------------------------------------------------------------------

Address Family = 2 IP Port = 0 IP Address = 131.215.58.9

 

IFSLIP Driver Static Storage:

-----------------------------------------------------------------------

Input Output

--------------- ---------------

 

Serial Device: /x2 /x2

Process ID's: 19 20

Compression: OFF OFF

Mbuf Queue Head: 0x00000000 0x00e27350

Bytes In/Out: 68908970 1439210109

IP Packets In/Out: 1823507 2845192

Compressed Packets: 187430 0

Uncompressed Packets: 23299 0

Biggest IP Packet: 414 552

Smallest IP Packet: 3 40

Errors: 409832 0

Reopens: 0 0

System path: 20 21

Death Flag: 0 0

 

Mbuf Size: 4096

Failed InMbuf Alloc: 0

Runts: 6896

GS_READY Polls: 9621394

SS_SIG Waits: 9621394

 

IFSLIP Device Descriptor Options:

-----------------------------------------------------------------------

Serial Device - Input: /x2

Serial Device - Output: /x2

Process Priority - Input: 128

Process Priority - Output: 128

Receive Buffer Size: 4096

Compression: OFF

Parity-Stop Bits-Bits/Char: 0x00

Baud Rate Code: 0x0f

 

 

=======================================================================

 

VSP) ifcontrol /sl2

mbuf control module revision: 1

total mbuf size: 393216

total allocated: 13248

minimum reserve: 49152

failed attempts: 0

allocation mode: NO WAIT

looking for if control module ifi.83D73A09

if control module revision: 1

ip address: 83D73A09

if device name: sl2

total queued on xmit: 12056

xmit queue limit: 15000

discarded xmit bytes: 258174308

discarded xmit packets: 471121

total queued on recv: 0

discarded recv packets: 0

queued bytes in serial xmit buffer: 604

total input packets: 1406780

total output packets: 2845252
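To make the comparison between the two reports explicit, the following sketch computes the counter differences and average rates. The counter values are copied from the slipstat and ifcontrol output above. The elapsed time is taken from the two date commands, so it is only approximate, and the dictionary keys are names chosen for this illustration.

```python
from datetime import datetime

# Times printed by the two `date` commands (approximate sample times).
t_first = datetime(1999, 8, 19, 22, 0, 6)
t_second = datetime(1999, 8, 19, 22, 6, 12)
elapsed = (t_second - t_first).total_seconds()

first = {
    "bytes_in": 68908970, "bytes_out": 1438618817,
    "packets_in": 1823507, "packets_out": 2844113,
    "polls": 9621394, "discarded_xmit_packets": 470644,
}
second = {
    "bytes_in": 68908970, "bytes_out": 1439210109,
    "packets_in": 1823507, "packets_out": 2845192,
    "polls": 9621394, "discarded_xmit_packets": 471121,
}

# Difference of every counter between the two reports.
deltas = {name: second[name] - first[name] for name in first}

for name, delta in deltas.items():
    print(f"{name:>24}: +{delta}")

print(f"elapsed: {elapsed:.0f} s")
print(f"outbound rate: {deltas['packets_out'] / elapsed:.2f} packets/s")
print(f"mean outbound packet size: "
      f"{deltas['bytes_out'] / deltas['packets_out']:.1f} bytes")
```

All inbound counters and the poll count show a difference of zero, while the outbound and discard counters keep growing, which is the signature of the problem state.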

 

 

 

 

Module Version Details

The slip driver is located in /h0/isp98/cmds, as ifslip.11 for the old version and ifslip.12 for the new version. The module used by the system is /h0/isp98/cmds/ifslip and is a copy of ifslip.12.

 

sysop: ident ifslip.11

Header for: ifslip

Module size: $264A #9802

Owner: 0.0

Module CRC: $2686D0 Good CRC

Header parity: $25C2 Good parity

Edition: $15 #21

Ty/La At/Rev $E01 $A001

Permission: $555 -----e-r-e-r-e-r

Dev Drv, 68000 obj, Sharable, System State Process

 

sysop: ident ifslip.12

Header for: ifslip

Module size: $26EC #9964

Owner: 0.0

Module CRC: $1D0880 Good CRC

Header parity: $2564 Good parity

Edition: $15 #21

Ty/La At/Rev $E01 $A001

Permission: $555 -----e-r-e-r-e-r

Dev Drv, 68000 obj, Sharable, System State Process

 

 

The serial port descriptor for Q4120 and Q730 systems is located in /h0/overlays/4120. The old version is located in a subdirectory 23/5x8530 while the new version is in another subdirectory 26/5x8530. The module used by the system is /h0/overlays/4120/5x8530 and is a copy of 26/5x8530. The module used by Q680 systems is /h0/overlays/147/5x8530 or /h0/overlays/00/5x8530 depending on the CPU type. Installing the lharc file will not update these Q680 serial port descriptors but the slip driver will avoid the error.

 

Module released with MSHEAR 36/09-0531:

sysop: ident 23/5x8530

Header for: 5x8530

Module size: $1120 #4384

Owner: 0.0

Module CRC: $A391B3 Good CRC

Header parity: $1E75 Good parity

Edition: $16 #22

Ty/La At/Rev $E01 $A001

Permission: $555 -----e-r-e-r-e-r

Dev Drv, 68000 obj, Sharable, System State Process

 

Revised Module:

sysop: ident 26/5x8530

Header for: 5x8530

Module size: $1160 #4448

Owner: 0.0

Module CRC: $DD5D03 Good CRC

Header parity: $1E79 Good parity

Edition: $1A #26

Ty/La At/Rev $E01 $A001

Permission: $555 -----e-r-e-r-e-r

Dev Drv, 68000 obj, Sharable, System State Process

 

Configuration Details

 

Portion of desired Key file for a SLIP link:

ws1 6

ws2 6

resend1 2

resend2 2

rspkt1 3

rspkt2 3

 

This produces the desired comlink configuration section of aqcfg:

* comlink section for IP mode on pri

*

[pri]

levels=32 mprio=20 port=35145 ipaddr=131.215.63.5 pkts=2500

fmt=QSL rce=y

resend=2 maxresends=15 synctime=20 ws=6

resendpkts=3 netdly=120 netto=60 delay=5

grpsize=1 grpto=0 detprio=14 timeprio=24

notify=y station=GOR udp=y keepnew=y

*

*

* comlink section for IP mode on sec

*

[sec]

levels=32 mprio=20 port=37145 ipaddr=131.215.63.6 pkts=2500

fmt=QSL rce=y

resend=2 maxresends=15 synctime=20 ws=6

resendpkts=3 netdly=120 netto=60 delay=5

grpsize=1 grpto=0 detprio=14 timeprio=24

notify=y station=GOR udp=y keepnew=y

*

 

Portion of the key file that caused the error when both comlinks became active. The keys are misspelled here as rspkts1 and rspkts2; the correct key names are rspkt1 and rspkt2, which aqcfg references as resendpkts=%rspkt1%.

 

ws1 6

ws2 6

resend1 2

resend2 2

rspkts1 3

rspkts2 3

 

This produces the following unintended comlink configuration section of aqcfg, in which the default key value of 6 is used for resendpkts, rather than 2:

 

* comlink section for IP mode on pri

*

[pri]

levels=32 mprio=20 port=35145 ipaddr=131.215.63.5 pkts=2500

fmt=QSL rce=y

resend=2 maxresends=15 synctime=20 ws=6

resendpkts=6 netdly=120 netto=60 delay=5

grpsize=1 grpto=0 detprio=14 timeprio=24

notify=y station=GOR udp=y keepnew=y

*

*

* comlink section for IP mode on sec

*

[sec]

levels=32 mprio=20 port=37145 ipaddr=131.215.63.6 pkts=2500

fmt=QSL rce=y

resend=2 maxresends=15 synctime=20 ws=6

resendpkts=6 netdly=120 netto=60 delay=5

grpsize=1 grpto=0 detprio=14 timeprio=24

notify=y station=GOR udp=y keepnew=y

*
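A misspelled key of this kind fails silently, because the default value is substituted without any warning. One way to catch it before deployment is a simple check of key names, sketched below. The one-pair-per-line format is inferred from the excerpts above, and EXPECTED_KEYS lists only the six keys shown in this note; a real key file contains other keys, so the set would need to be extended. The function names are hypothetical.

```python
# Only the keys that appear in this note; a real key file has more.
EXPECTED_KEYS = {"ws1", "ws2", "resend1", "resend2", "rspkt1", "rspkt2"}


def parse_key_file(text: str) -> dict:
    """Read 'name value' pairs, one per line, skipping other lines."""
    pairs = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            pairs[parts[0]] = parts[1]
    return pairs


def check_keys(pairs: dict, expected=EXPECTED_KEYS):
    """Return (unknown key names, expected keys that are absent)."""
    unknown = sorted(set(pairs) - set(expected))
    missing = sorted(set(expected) - set(pairs))
    return unknown, missing


# The key file excerpt that caused the error.
bad_key_file = """\
ws1 6
ws2 6
resend1 2
resend2 2
rspkts1 3
rspkts2 3
"""

unknown, missing = check_keys(parse_key_file(bad_key_file))
print("unknown keys:", unknown)
print("missing keys (defaults would apply):", missing)
```

For the excerpt above, the check reports rspkts1 and rspkts2 as unknown and rspkt1 and rspkt2 as missing, which points directly at the misspelling.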