Glasgow DPM RFIO TCP NODELAY

From ScotGrid

I wanted to test the DPM fix for dCache to DPM file transfers. Greig had been doing some transfers from Edinburgh's dCache into Glasgow's DPM when the head node (se2-gla) had its dteam-only pool writable. He noticed a significant failure rate, so I wondered whether setting the TCP_NODELAY option had badly affected performance.

Testing to se2-gla

Tested using a 10-file FTS transfer from Edinburgh's DPM.

Image:Ed2gla_se2_nodelay_set_then_unset.png

Load on the machine goes very high, particularly CPU usage in the latter half of the transfer (as soon as the first transfers have completed?). With NODELAY unset, 5 of the 10 files failed to transfer (50%); with it set, 3 failed (30%). The failures were in the SRM layer, e.g., failing to do 'setDone' on the target SRM.

When the NODELAY option was set there was some evidence of high Recv-Q values internal to DPM:

tcp   480771      0 se2-gla.scotgrid.ac.u:20000 dpm.epcc.ed.ac.uk:50004     ESTABLISHED
tcp   237686      0 se2-gla.scotgrid.ac.u:20001 dpm.epcc.ed.ac.uk:50005     ESTABLISHED
tcp   485298      0 se2-gla.scotgrid.ac.u:20001 dpm.epcc.ed.ac.uk:50006     ESTABLISHED
tcp   480771      0 se2-gla.scotgrid.ac.u:20001 dpm.epcc.ed.ac.uk:50007     ESTABLISHED
tcp   237703      0 se2-gla.scotgrid.ac.u:20000 dpm.epcc.ed.ac.uk:50000     ESTABLISHED
tcp   488235      0 se2-gla.scotgrid.ac.u:20000 dpm.epcc.ed.ac.uk:50001     ESTABLISHED
tcp   480767      0 se2-gla.scotgrid.ac.u:20000 dpm.epcc.ed.ac.uk:50002     ESTABLISHED
tcp   488054      0 se2-gla.scotgrid.ac.u:20000 dpm.epcc.ed.ac.uk:50003     ESTABLISHED
tcp   480784      0 se2-gla.scotgrid.ac.u:20001 dpm.epcc.ed.ac.uk:50008     ESTABLISHED
tcp   485294      0 se2-gla.scotgrid.ac.u:20001 dpm.epcc.ed.ac.uk:50009     ESTABLISHED

This confirms the result that the DPM daemons seem to fail to communicate properly under high load.
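A check like the listing above could be scripted rather than eyeballed. The following is only a sketch: it assumes netstat-style lines with the receive queue in the second column, as shown above, and the function name `high_recv_q` and the 100,000-byte threshold are made up for illustration, not taken from DPM or any ScotGrid tooling.

```python
def high_recv_q(netstat_text, threshold=100_000):
    """Return (recv_q, local, remote) tuples for ESTABLISHED TCP sockets
    whose receive queue exceeds `threshold` bytes, largest first.

    `threshold` is an arbitrary illustrative value, not a DPM setting.
    """
    hits = []
    for line in netstat_text.splitlines():
        fields = line.split()
        # Expect: proto recv-q send-q local remote state
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        recv_q, local, remote, state = fields[1], fields[3], fields[4], fields[5]
        if state != "ESTABLISHED" or not recv_q.isdigit():
            continue
        if int(recv_q) > threshold:
            hits.append((int(recv_q), local, remote))
    return sorted(hits, reverse=True)


# Two lines copied from the listing above, used as sample input.
SAMPLE = """\
tcp   480771      0 se2-gla.scotgrid.ac.u:20000 dpm.epcc.ed.ac.uk:50004     ESTABLISHED
tcp   237686      0 se2-gla.scotgrid.ac.u:20001 dpm.epcc.ed.ac.uk:50005     ESTABLISHED
"""

if __name__ == "__main__":
    for recv_q, local, remote in high_recv_q(SAMPLE):
        print(recv_q, local, remote)
```

In practice the input would be the saved output of netstat taken during a transfer; a persistently large receive queue suggests the receiving process is not draining the socket fast enough.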

se2 has a single IDE disk, but I did check that DMA is on.

Testing to pool1-gla

Set the se2 and pool2 pools to RDONLY to force transfers onto pool1 (which has large fibre channel disk arrays), then repeated the test.

Image:Ed2gla_pool1_nodelay_set_then_unset.png

The machine performs pretty much identically in both cases. None of the files fail to transfer, and the rate is ~180Mb/s.

I'm thinking that having the DPM head node as a disk pool is very bad for DPM!

Conclusion

export RFIO_TCP_NODELAY=yes seems to be OK, but I'm concerned as to why transferring into se2 has become so unreliable. This was definitely not observed before Xmas.
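For anyone unfamiliar with what the option does: TCP_NODELAY is a standard socket option that disables Nagle's algorithm, so small writes are sent immediately rather than coalesced. The sketch below shows how a program might honour an environment variable such as RFIO_TCP_NODELAY. It is not the RFIO code; the helper names and the assumption that the value is matched against "yes" case-insensitively are illustrative guesses.

```python
import os
import socket


def nodelay_requested(env=os.environ):
    """True if RFIO_TCP_NODELAY is 'yes' (assumed semantics, not RFIO's own parser)."""
    return env.get("RFIO_TCP_NODELAY", "").lower() == "yes"


def make_socket(env=os.environ):
    """Create a TCP socket, disabling Nagle's algorithm if the variable asks for it."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    if nodelay_requested(env):
        # Standard socket-level call; a C program would use setsockopt() the same way.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock


if __name__ == "__main__":
    s = make_socket({"RFIO_TCP_NODELAY": "yes"})
    # A non-zero value means the option is enabled on this socket.
    print(s.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY))
    s.close()
```

Passing the environment in as a dictionary keeps the helper easy to try with the variable both set and unset, which mirrors the set/unset comparison made in the tests above.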