This page describes how to let N arbitrary programs each read a copy of a single input stream.
If you have large data on the shared central storage, it is sometimes most efficient to have several different programs on the same node read it at the same time. For example, suppose your input is a large file and you want to count its lines, grep out the lines containing SOMETHING, and compute its MD5 checksum. These are three IO-bound operations that each need sequential access to the input. The naive method, running each command separately on the file, transfers the file three times from the shared storage to the worker node(s).
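The commands for this example are not shown here, so the following is only a sketch of what the naive method and a single-read alternative might look like. It assumes `mkfifo`, `tee`, `wc`, `grep` and `md5sum` (GNU coreutils) are available. The directory `demo`, the file `input.dat` and the pipe names are placeholders, and the tiny generated file stands in for a large file on shared storage.

```bash
#!/bin/sh
# Sketch only: "demo", "input.dat" and the pipe names are placeholder names.
d=demo
mkdir -p "$d"
# Small stand-in for a large file on shared storage.
printf 'alpha\nSOMETHING here\nbeta SOMETHING\n' > "$d/input.dat"

# Naive method: the input is read from storage three times.
wc -l < "$d/input.dat" > "$d/naive.lines"
grep SOMETHING "$d/input.dat" > "$d/naive.matches"
md5sum < "$d/input.dat" > "$d/naive.md5"

# Single read: tee copies the stream to two named pipes and to its stdout.
rm -f "$d/pipe.wc" "$d/pipe.grep"
mkfifo "$d/pipe.wc" "$d/pipe.grep"
# Start the two background readers first; each blocks until tee opens its pipe.
wc -l < "$d/pipe.wc" > "$d/lines" &
grep SOMETHING < "$d/pipe.grep" > "$d/matches" &
# tee reads the input once; md5sum is the third reader, on tee's stdout.
tee "$d/pipe.wc" "$d/pipe.grep" < "$d/input.dat" | md5sum > "$d/md5"
# Let the background readers finish before cleaning up.
wait
rm -f "$d/pipe.wc" "$d/pipe.grep"
```

In the second half the input file is opened exactly once. All three programs run concurrently, and `tee` advances only as fast as the slowest reader.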
More complex schemes with multiple input and output files are also possible (add examples here if you have them). Note that sharing one read of the input among several programs only helps if those programs need sequential access to the same input and are IO-bound on it.
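As one possible illustration of a scheme with several output files, the sketch below wraps the idea in a shell function that feeds one input to N commands and writes each command's output to its own file. The function name `fan_out`, its argument order, and the `demo2` file names are all hypothetical and chosen only for this example. It assumes `mktemp -d`, `mkfifo` and `tee` are available.

```bash
#!/bin/sh
# fan_out INPUT PREFIX CMD...   (hypothetical helper, for illustration only)
# Reads INPUT once and feeds a copy to every CMD (a shell command string that
# reads stdin). The output of the i-th command is written to PREFIX.i.
fan_out() {
    input=$1
    prefix=$2
    shift 2
    [ "$#" -ge 1 ] || return 1          # need at least one command
    dir=$(mktemp -d) || return 1        # private directory for the named pipes
    i=0
    for cmd in "$@"; do
        i=$((i + 1))
        mkfifo "$dir/p$i" || return 1
        # Each reader runs in the background and blocks until tee opens its pipe.
        sh -c "$cmd" < "$dir/p$i" > "$prefix.$i" &
    done
    # One sequential read of the input, copied into every pipe.
    tee "$dir"/p* < "$input" > /dev/null
    wait                                # let all readers finish
    rm -rf "$dir"
}

# Usage with a small placeholder input:
mkdir -p demo2
printf 'one\ntwo SOMETHING\nthree\n' > demo2/input.dat
fan_out demo2/input.dat demo2/out 'wc -l' 'grep -c SOMETHING' 'md5sum'
```

After the call, `demo2/out.1` holds the line count, `demo2/out.2` the number of matching lines, and `demo2/out.3` the checksum. A reader that exits before consuming all its input (for example `head -1`) can cause `tee` to stop early, so this sketch suits commands that read their input to the end.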