Distributed High Performance Computing

Background: Distributed high performance computing was the initial impetus for the Thebes consortium, and continues to be the most developed model. In this model we add the complexity of job schedulers at each high performance computing (HPC) device, as well as the transport of data sets into and out of the HPC devices. Users locate available resources via the resource discovery network (RDN), but in the HPC case the metadata in the RDN has a dynamic aspect, because how busy a resource is plays a key role in the decision whether to use it. The Thebes service installed on each resource will filter SAML (Cantor, Hodges, Morgan) assertions, check them against the policy enforcement point, and pass appropriate work to the local job submission tool.

Actors:

Systems administrators: Administrators at each enterprise will connect the identity provider to the local identity store and install a local resource discovery network node. This node will be introduced to one or more external RDN nodes. The administrators may also establish an enterprise-level policy administration point. As resource administrators connect compute or file system resources to the network, they will install the Thebes service on each resource, create policies, and publish their resource to the RDN. Each client computer at each shop and corporate office needs custom client software that plugs into the Thebes infrastructure. User authentication will be accomplished via the Thebes plug-in. This is equally true of local users and remote queries.

Researchers: The submit tool for Thebes is a simple Java installation. It will accept a username and password and perform the necessary work to obtain a signed assertion from the enterprise identity provider. Additionally, this tool will accept from the user a detailed description of the job, in a format that is well understood by the popular job schedulers, as well as all the requirements of the job.
When the researcher submits the job, it is sent to a high-level scheduler that continuously collects dynamic data from all HPC resources known to the resource discovery network. The scheduler can either return an ordered set of resources for the user to choose from, or automatically select the most appropriate resource to submit the work to.

Local Management: In this case, local management can represent the various division and department heads that sit between upper management and the researcher. In some cases, it may be appropriate for these positions to assign policies to the resources in their domain that are more stringent than the overall enterprise policies. This layer of policy control might also relieve the resource owners of the need for additional policies. If the system is set up to collect accounting information, management can use the collected data to share costs or invoice for computational time, or to justify expenditures to funding agencies.

Senior Staff: If Thebes is going to be used to cross administrative domains, senior staff buy-in and participation may be needed to protect local interests and satisfy legal requirements. Generally, sharing resources will require agreements between the institutions involved in the exchange, with expectations, limitations, responsibilities and requirements spelled out. Once these agreements are in place, the agreed policies will have to be codified in the policy administration tools, which will represent the minimum set of restrictions that comply with the agreements.
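
The two behaviours of the high-level scheduler described under Researchers (returning an ordered set of resources, or selecting one automatically) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `Resource` fields, the function names, and the ranking rule (prefer resources that can start the job now, then shorter queues, then more free cores) are hypothetical and are not taken from the Thebes software, which may weigh the dynamic RDN metadata quite differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    """Hypothetical snapshot of the dynamic metadata for one HPC resource."""
    name: str
    total_cores: int
    busy_cores: int
    queued_jobs: int

    @property
    def free_cores(self) -> int:
        return self.total_cores - self.busy_cores


def rank_resources(resources, cores_needed):
    """Return the resources large enough for the job, most attractive first."""
    # Drop resources that could never run the job, however idle they are.
    eligible = [r for r in resources if r.total_cores >= cores_needed]
    # Assumed ranking rule: resources that can start the job immediately
    # come first, then shorter queues, then more free cores.
    return sorted(
        eligible,
        key=lambda r: (r.free_cores < cores_needed, r.queued_jobs, -r.free_cores),
    )


def select_resource(resources, cores_needed):
    """Automatic mode: pick the top-ranked resource, or None if none fits."""
    ranked = rank_resources(resources, cores_needed)
    return ranked[0] if ranked else None


if __name__ == "__main__":
    known = [
        Resource("alpha", total_cores=64, busy_cores=60, queued_jobs=5),
        Resource("beta", total_cores=128, busy_cores=32, queued_jobs=1),
        Resource("gamma", total_cores=16, busy_cores=0, queued_jobs=0),
    ]
    # Interactive mode: show the ordered list and let the user choose.
    for r in rank_resources(known, cores_needed=32):
        print(r.name, r.free_cores, r.queued_jobs)
    # Automatic mode: let the scheduler choose.
    print(select_resource(known, cores_needed=32))
```

In a real deployment the ranking would also have to respect the policies set by resource owners, local management and inter-institution agreements; this sketch leaves policy evaluation out entirely.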