This shows you the differences between two versions of the page.
Next revision | Previous revision | ||
copying_data_to_the_nodes [2014/04/24 13:32] root created |
copying_data_to_the_nodes [2014/11/04 17:44] (current) root |
||
---|---|---|---|
Line 1: | Line 1: | ||
====== Copying Data to the Nodes ====== | ====== Copying Data to the Nodes ====== | ||
- | If you have a job that reads the same data file many times, or makes random accesses to a data file, it may be more efficient to have that data locally on a node than compete with other users to access the file server. | + | If you have a job that reads the same data file many times, or makes many "random" |
Each node has almost 1TB of space mounted on /tmp. This /tmp space is local to each node. | Each node has almost 1TB of space mounted on /tmp. This /tmp space is local to each node. | ||
Line 9: | Line 9: | ||
If what your program does is read a file strictly sequentially just once, this copy is unlikely to help. | If what your program does is read a file strictly sequentially just once, this copy is unlikely to help. | ||
- | There is a couple of options | + | There are a couple of options |
1) Do it directly in your script. | 1) Do it directly in your script. | ||
Line 15: | Line 15: | ||
< | < | ||
cp / | cp / | ||
- | run your process on the data in .tmp | + | ... Run your process on the data in /tmp ... |
+ | rm /tmp/ | ||
</ | </ | ||
(Really you would use mktemp to get a unique name to avoid clashes.) | (Really you would use mktemp to get a unique name to avoid clashes.) | ||
+ | |||
+ | Be careful if you have multiple copies of your script running on a node: you could be copying the data multiple times. |
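Option 1 might look like the minimal sketch below. It is only an illustration under stated assumptions: the `SRC` variable, the `mydata` and `demo_src` names, and the use of `wc -l` as a stand-in for the real processing are all hypothetical and do not come from the page. When `SRC` is unset, the script generates a small demo file so that it can be run as-is.

```shell
#!/bin/sh
# Sketch: stage a data file on node-local /tmp, work on the copy, then clean up.
set -eu

# SRC is a hypothetical placeholder for your data file on the file server.
# If it is unset, create a small stand-in file so the sketch runs on its own.
if [ -z "${SRC:-}" ]; then
    SRC=$(mktemp /tmp/demo_src.XXXXXX)
    printf 'alpha\nbeta\ngamma\n' > "$SRC"
    DEMO_SRC=$SRC
fi

# mktemp -d creates a uniquely named directory, so concurrent jobs do not clash.
WORK=$(mktemp -d /tmp/mydata.XXXXXX)

# Remove the local copy (and the demo file, if any) when the script exits.
cleanup() {
    rm -rf "$WORK"
    [ -z "${DEMO_SRC:-}" ] || rm -f "$DEMO_SRC"
}
trap cleanup EXIT

# Copy once, then do all reads against the local copy.
cp "$SRC" "$WORK/data"

# Placeholder for the real work: count the lines in the local copy.
LINES_READ=$(wc -l < "$WORK/data" | tr -d ' ')
echo "processed $LINES_READ lines from $WORK/data"
```

Each invocation of this sketch makes its own private copy, so several instances on the same node will still copy the data several times; it avoids name clashes but not the duplicated copying.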
2) Copy to all nodes allocated to your task using sbcast. | 2) Copy to all nodes allocated to your task using sbcast. |