Thursday 7 March 2013

Sometimes you need to flush

One of the great things about the 200-series nodes (X and S) is that you can specify how much memory and how many SSDs you want in a node.  Fantastic! I can put 48GB of RAM and 2 SSDs (for metadata acceleration) in an X200 node to host my commodity data, and 96GB of RAM and 4 SSDs in an S200 node to support my high-performance storage requirements.

The issue here is that you could potentially be the first / only customer running a particular config.

So what happens when you send a shut down command to a node with 96GB RAM running OneFS 6.5.x?  From some testing I ran at the start of this year, it looks about 70 / 30 that the nodes will shut down as expected.  In the minority of cases the shut down is aborted, due to a timeout flushing data from memory.

To work around this issue you can run isi_flush before issuing the shut down command.  In my testing, flushing before the shut down raised the success rate to 100%, so we have a fix until we have a fix.


As you might expect, you can run isi_flush through isi_for_array to flush all nodes in a cluster prior to a shut down.

isi_for_array "isi_flush"
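If you want to make the flush-then-shut-down order hard to get wrong, a small wrapper can help. What follows is only a sketch: the function name flush_then_shutdown and the FLUSH_CMD variable are my own inventions, the shut down command is whatever you pass in (I'm not assuming a particular one here), and it assumes the flush command returns a non-zero exit status when it fails, which I haven't verified for isi_flush.

```shell
#!/bin/sh
# Sketch only: run the cluster-wide flush, then run the shut down
# command the caller supplies, but only if the flush succeeded.
# flush_then_shutdown and FLUSH_CMD are hypothetical names, not part of OneFS.

flush_then_shutdown() {
    # "$@" is the shut down command to run after a successful flush.
    if [ "$#" -eq 0 ]; then
        echo "usage: flush_then_shutdown <shut down command...>" >&2
        return 2
    fi

    # Default to the flush command from this post; override FLUSH_CMD to test.
    flush_cmd=${FLUSH_CMD:-'isi_for_array "isi_flush"'}

    echo "flushing: $flush_cmd"
    # Assumption: a failed flush is reported through a non-zero exit status.
    if ! sh -c "$flush_cmd"; then
        echo "flush failed; not shutting down" >&2
        return 1
    fi

    echo "flush ok; running: $*"
    "$@"
}
```

You would call it as flush_then_shutdown followed by your usual shut down command. Setting FLUSH_CMD=true and passing echo as the command lets you dry-run it away from a cluster.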

Interestingly, only the shut down command is affected by the memory flush timeout; reboots always work. Go figure.
