

Optimal Job Size for the SLURM Scheduler?

3/21/14 3:12 PM
I developed my experiment setup on Kraken and ported it to Stampede a month and a half ago, and for the first month it was great. In the last week or so, however, my queue throughput has tanked pretty severely. On Kraken, I think the scheduler favored large jobs, except that jobs small enough to run as "backfill" could jump the queue, and I managed to split my experiments into chunks small enough that they frequently ran as backfill. This brings me to my question: what is the optimal job size? Is Stampede set up the same way as Kraken, where the scheduler favors larger jobs but will backfill? If so, how small do jobs need to be to run as backfill? Waiting 4 hours for a 32-core, 20-minute job to go through is getting a little old.
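For what it's worth, Slurm's backfill scheduler fits a job into idle gaps only if its *requested* walltime ends before the next big reservation starts, so a tight, accurate `-t` limit matters as much as core count. A minimal sketch of a small, backfill-friendly batch script is below; the partition name and the executable are assumptions, so check the Stampede user guide for the actual queue names and limits:

```shell
#!/bin/bash
# Sketch of a small Slurm job that is a good backfill candidate.
# "-p normal" and "./my_experiment" are placeholders/assumptions.
#SBATCH -J backfill-test     # job name
#SBATCH -p normal            # partition (assumed)
#SBATCH -n 32                # total MPI tasks
#SBATCH -t 00:20:00          # short, accurate walltime is key for backfill
#SBATCH -o %j.out            # stdout named after the job id

ibrun ./my_experiment        # TACC's MPI launcher on Stampede
```

After submitting, `squeue --start -j <jobid>` shows the scheduler's estimated start time, which is a quick way to see whether a shorter walltime request actually moves you up.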