MATLAB DISTRIBUTED COMPUTING SERVER 4 - SYSTEM ADMINISTRATORS GUIDE User Manual

View online or download the User Manual for the server product MATLAB DISTRIBUTED COMPUTING SERVER 4 - SYSTEM ADMINISTRATORS GUIDE. Distributed Computing Server System Administrator's Guide user guide

Content summary

Page 1 - System Administrator’s Guide

MATLAB® Distributed Computing Server™ System Administrator’s Guide R2013b

Page 2 - Natick, MA 01760-2098

1 Introduction MATLAB Distributed Computing Server Product Description Perform MATLAB® and Simulink® computations on clusters, clouds, and grids MATLAB Di

Page 3 - Revision History

3 Product Installation Step 3: Validate Cluster Profile In this step you validate your cluster profile, and thereby your installation. 1 If it is not a

Page 4

Configure for a Generic Scheduler Note If your validation fails any stage, contact the MathWorks install support team. If your validation passed, you no

Page 5 - Contents

3 Product Installation 3-54

Page 6

4 Admin Center • “Start Admin Center” on page 4-2 • “Set Up Resources” on page 4-3 • “Test Connectivity” on page 4-11 • “Export and Import Sessions” on

Page 7

4 Admin Center Start Admin Center Admin Center is a graphical user interface with which you can control and monitor the MATLAB Distributed Computing S

Page 8

Set Up Resources Set Up Resources In this section... “Add Hosts” on page 4-3 “Start mdce Service” on page 4-4 “Start an MJS” on page 4-5 “Start Workers” on

Page 9 - Introduction

4 Admin Center Start mdce Service A host must be running the mdce service if an MJS or worker is to run on that host. Normally, you set this up with Adm

Page 10 - 1 Introduction

Set Up Resources A dialog box leads you through the procedure of starting the mdce service on the selected hosts. There are five steps to the procedure

Page 11 - Product Overview

4 Admin Center In the New MATLAB Job Scheduler dialog box, provide a name for the MJS, and select a host to run it on. Alternative methods for startin

Page 12

Set Up Resources Start Workers To start MATLAB workers, click Start in the Workers module. In the Start Workers dialog box, specify the numbers of worke

Page 13 - Toolbox and Server Components

Product Overview Product Overview In this section... “Parallel Computing Concepts” on page 1-3 “Determining Product Installation and Versions” on page 1-

Page 14

4 Admin Center Alternative methods for starting workers include selecting the pull-down Workers > Start, or right-clicking a listed host or MJS

Page 15

Set Up Resources To get more information on any host, MJS, or worker listed in Admin Center, right-click its name in the display and select Propertie

Page 16

4 Admin Center Move a Worker To move a worker from one host to another, you must completely shut it down, then start a new worker on the desired host: 1

Page 17

Test Connectivity Test Connectivity Admin Center lets you test communications between your MJS node, worker nodes, and the node where Admin Center is r

Page 18

4 Admin Center When the tests are complete, the Running Tests dialog box automatically closes, and Admin Center displays the test results in the Conne

Page 19 - Network Administration

Test Connectivity Tests that include failures or other results might look like the following figure. Double-click any of the symbols in the test results

Page 20 - 2 Network Administration

4 Admin Center Export and Import Sessions By default, Admin Center saves the cluster definition, process status, and test results, so the next time t

Page 21 - Fully Qualified Domain Names

Prepare for Cluster Profiles Prepare for Cluster Profiles Admin Center does not create cluster profiles, but the information displayed in Admin Center

Page 22

4 Admin Center 4-16

Page 23 - Install and Configure

5 Control Scripts - Alphabetical List

Page 24

1 Introduction [Figure labels: MATLAB Client, Parallel Computing Toolbox, Scheduler, MATLAB Worker, MATLAB Distributed Computing Server]

Page 25

admincenter Purpose Start Admin Center GUI Syntax admincenter Description admincenter opens the MATLAB Distributed Computing Server Admin Center. When set

Page 26

createSharedSecret Purpose Create shared secret for secure communication Syntax createSharedSecret createSharedSecret -file <filename> Description c

Page 27

mdce Purpose Install, start, stop, or uninstall mdce service Syntax mdce install mdce uninstall mdce start mdce stop mdce console mdce restart mdce ... -mdced

Page 28

mdce mdce stop stops running the mdce service. This automatically stops all job managers and workers on the computer, but leaves their checkpoint infor

Page 29 - <MyJobManager> -v

nodestatus Purpose Status of mdce processes running on node Syntax nodestatus nodestatus -flags Description nodestatus displays the status of the mdce ser

Page 30

nodestatus Flag Operation -baseport <port_number> Specifies the base port that the mdce service on the remote host is using. You need to specify thi

Page 31 - Custom Startup Parameters

remotecopy Purpose Copy file or folder to or from one or more remote hosts using transport protocol Syntax remotecopy <flags> <protocol options

Page 32

remotecopy Flags and Options Operation -quiet Prevent remotecopy from prompting for missing information. The command fails if all required information is no

Page 33 - Override Script Defaults

remotecopy Retrieve folders of the same name from two hosts to the local machine. (Enter command on a single line.) remotecopy -local C:\temp\log -from -

Page 34

remotemdce Purpose Execute mdce command on one or more remote hosts by transport protocol Syntax remotemdce <mdce options> <flags> <protoco

Page 35 - Access Service Record Files

Toolbox and Server Components Toolbox and Server Components In this section... “Schedulers, Workers, and Clients” on page 1-5 “Third-Party Schedulers” on

Page 36

remotemdce Flags and Options Operation -protocol <type> Force the usage of a particular protocol type. Specifying a protocol type with all its require

Page 37 - SECURITY_LEVEL

remotemdce Start mdce in a clean state on two UNIX operating system machines from a Windows operating system machine, using the ssh protocol. Enter th

Page 38

startjobmanager Purpose Start job manager process Syntax startjobmanager startjobmanager -flags Description startjobmanager starts a job manager process

Page 39 - SetSecureCommunication

startjobmanager Flag Operation -clean Deletes all checkpoint information stored on disk from previous instances of this job manager before starting. Thi

Page 40

startworker Purpose Start MATLAB worker session Syntax startworker startworker -flags Description startworker starts a MATLAB worker process under the

Page 41 - Troubleshoot Common Problems

startworker Flag Operation -jobmanagerhost <job_manager_hostname> Specifies the host on which the job manager is running. The worker contacts the job

Page 42

startworker Start two workers, named worker1 and worker2, on the host WorkerHost, registering with the job manager MyJobManager that is running on th

Page 43 - Required Ports

stopjobmanager Purpose Stop job manager process Syntax stopjobmanager stopjobmanager -flags Description stopjobmanager stops a job manager that is running

Page 44

stopjobmanager Flag Operation -baseport <port_number> Specifies the base port that the mdce service on the remote host is using. You need to specify

Page 45 - Host Communications Problems

stopworker Purpose Stop MATLAB worker session Syntax stopworker stopworker -flags Description stopworker stops a MATLAB worker process that is running und

Page 46

1 Introduction [Figure labels: Client, Scheduler, Worker, Job, All Results, Task, Results] Interactions of Parallel Computing

Page 47

stopworker Flag Operation -baseport <port_number> Specifies the base port that the mdce service on the remote host is using. You need to specify thi

Page 48

Glossary Glossary CHECKPOINTBASE The name of the parameter in the mdce_def file that defines the location of the checkpoint directories for the MATLAB job schedu

Page 49 - Product Installation

Glossary distributed application The same application that runs independently on several nodes, possibly with different input parameters. There is no co

Page 50

Glossary homogeneous cluster A cluster of identical machines, in terms of both hardware and software. independent job A job composed of independent tasks

Page 51 - Client Node

Glossary mdce_def file The file that defines all the defaults for the mdce processes by allowing you to set preferences or definitions in the form of pa

Page 52

Glossary spmd (single program multiple data) A block of code that executes simultaneously on multiple workers in a parallel pool. Each worker can oper

Page 54

Index Index A admincenter control script 5-2 administration network 2-1 C checkpoint folder locating 2-18 clean state starting services 2-16 client process 1-5 con

Page 55

Index R remotecopy control script 5-8 remotemdce control script 5-11 requirements 2-3 S scheduler third-party 1-6 security 2-4 startjobmanager control script 5

Page 56

Toolbox and Server Components scheduler, PBS Pro scheduler, TORQUE scheduler, mpiexec, or a generic scheduler. Choosing Between a Scheduler and MJS Yo

Page 57

1 Introduction • Who administers your cluster? The person administering your cluster might have a preference for how jobs are scheduled. Components on Mi

Page 58

Using Parallel Computing Toolbox™ Software Using Parallel Computing Toolbox Software A typical Parallel Computing Toolbox client session includes the f

Page 59

1 Introduction 1-10

Page 60

2 Network Administration This chapter provides information useful for network administration of Parallel Computing Toolbox software and MATLAB

Page 61

How to Contact MathWorks www.mathworks.com Web comp.soft-sys.matlab Newsgroup www.mathworks.com/contact_TS.html Technical [email protected] Pro

Page 62 - MyMJS to run on host node1

2 Network Administration Prepare for Parallel Computing In this section... “Plan Your Network Layout” on page 2-2 “Network Requirements” on page 2-3 “Full

Page 63

Prepare for Parallel Computing running on all machines that run job manager sessions or workers that are registered with a job manager. (The mdce servic

Page 64

2 Network Administration Security Considerations The parallel computing products do not provide any security measures. Therefore, be aware of the follow

Page 65

Install and Configure Install and Configure To find the most up-to-date instructions for installing and configuring the current or past versions of the p

Page 66

2 Network Administration Use Different MPI Builds on UNIX Systems In this section... “Build MPI” on page 2-6 “Use Your MPI Build” on page 2-6 Build MPI To

Page 67

Use Different MPI Builds on UNIX® Systems 1 Test your build by running the mpiexec executable. The build should be ready to test if its bin/mpiexec and li

Page 68

2 Network Administration any), together. Set the configuration’s MpiexecFileName property to /opt/mpich2/mpich2-1.4.1p1/bin/mpiexec. • If you are usin

Page 69 - Time (UNIX)

Shut Down a Job Manager Cluster Shut Down a Job Manager Cluster In this section... “UNIX and Macintosh Operating Systems” on page 2-9 “Microsoft Windows O

Page 70

2 Network Administration If you have more than one worker session running, you can stop each of them individually by host and name. stopworker -name worker1 -remoteh

Page 71

Shut Down a Job Manager Cluster Microsoft Windows Operating Systems Stop the Job Manager and Workers Enter the commands of this section at the prompt in

Page 72

Revision History November 2005 Online only New for Version 2.0 (Release 14SP3+) December 2005 Online only Revised for Version 2.0 (Release 14SP3+) March

Page 73

2 Network Administration service while leaving the machine on, enter the following commands at a DOS command prompt: cd matlabroot\toolbox\distcomp\b

Page 74

Custom Startup Parameters Custom Startup Parameters In this section... “Define Script Defaults” on page 2-13 “Override Script Defaults” on page 2-15 The M

Page 75

2 Network Administration Note If you want to run more than one job manager on the same machine, they must all have unique names. Specify the names us

Page 76 - Configure for HPC Server

Custom Startup Parameters Privilege Purpose Local Security Settings Policy SeServiceLogonRight Required to log on using the service logon type. Log on as a se

Page 77 - CLUSTER_NAME. If you

2 Network Administration Alternatively, you can make a copy of this file, modify the copy, and specify that this copy be used for the default parameter

Page 78

Access Service Record Files Access Service Record Files In this section... “Locate Log Files” on page 2-17 “Locate Checkpoint Folders” on page 2-18 The

Page 79 - Configure for HPC Server

2 Network Administration Locate Checkpoint Folders Checkpoint folders contain information related to persistence data, which the server services use to

Page 80

Set MJS Cluster Security Set MJS Cluster Security In this section... “Set the Security Level” on page 2-19 “Local, MJS, and Network Passwords” on page 2-2

Page 81

2 Network Administration Security Level Description User Requirements • Tasks run as the user who started the mdce process on the worker machines (typica

Page 82

Set MJS Cluster Security Security Level Description User Requirements your system/network user name and password, because the worker must log you in to ru

Page 84

2 Network Administration You must also provide a value for the SHARED_SECRET_FILE parameter in the mdce_def file, identifying where the file can be fo

Page 85 - TORQUE Scheduler

Troubleshoot Common Problems Troubleshoot Common Problems In this section... “License Errors” on page 2-23 “Memory Errors on UNIX Operating Systems” on pa

Page 86 - JobStorageLocation

2 Network Administration • If you receive this error when starting a worker with MATLAB Distributed Computing Server software: - You may be calling the

Page 87 - 3 Click Validate

Troubleshoot Common Problems - If you installed only the Parallel Computing Toolbox product, and you are attempting to run a worker on the same machine

Page 88

2 Network Administration With Third-Party Scheduler Before the worker processes start, you can control the range of ports used by the workers for commun

Page 89

Troubleshoot Common Problems Ephemeral TCP Ports with Job Manager If you use the job manager on a cluster of nodes running Windows operating systems, you

Page 90

2 Network Administration With Command-Line Interface First, be sure that the machines in question agree on their IP resolutions. The IP address for a pa

Page 91

Troubleshoot Common Problems Verify Multicast Communications Note Multicast is required on the head node running the MATLAB job scheduler (MJS) and on th

Page 92

2 Network Administration The following example shows how to use the Java class inside MATLAB. Start MATLAB on two machines (e.g., host1name and host2

Page 93 - Using Passwordless Delegation

3 Product Installation • “Install Products and Choose Cluster Configuration” on page 3-2 • “Configure for an MJS” on page 3-5 • “Configure for HPC Server” on pa

Page 94

Contents Introduction 1 MATLAB Distributed Computing Server Product Description ... 1-2 Key Features ...

Page 95

3 Product Installation Install Products and Choose Cluster Configuration In this section... “Cluster Description” on page 3-2 “Install Products” on pa

Page 96

Install Products and Choose Cluster Configuration MDCS Cluster Client Node PCT Product Installations on Client Nodes Install Products On the Cluster Node

Page 97 - \toolbox\local

3 Product Installation Configure Your Cluster When the cluster and client installations are complete, you can proceed to configure the products for

Page 98

Configure for an MJS Configure for an MJS In this section... “Configure Cluster to Use a MATLAB Job Scheduler (MJS)” on page 3-5 “Configure Windows Firew

Page 99 - @deleteJobFcn

3 Product Installation Step 1: Set Up Windows Cluster Hosts If this is the first installation of MATLAB Distributed Computing Server on a cluster of W

Page 100 - 3 Product Installation

Configure for an MJS matlabroot\toolbox\distcomp\bin\mdce_def.bat 2 Find the line for setting the MDCEUSER parameter, and provide a value in the form do

Page 101

3 Product Installation cd oldmatlabroot\toolbox\distcomp\bin 3 Stop and uninstall the old mdce service and remove its associated files by typing the

Page 102

Configure for an MJS Using Admin Center GUI. Note To use Admin Center, you must run it on a computer that has direct network connectivity to all the n

Page 103 - Admin Center

3 Product Installation b Click Add or Find. The Add or Find Hosts dialog box opens. c Select Enter Hostnames, then list your hosts in the text box. Y

Page 104 - Start Admin Center

Configure for an MJS Keep the check to start mdce service. d Click OK to open the Start mdce service dialog box. Proceed through the steps clicking Next

Page 105 - Set Up Resources

Use Your MPI Build ... 2-6 Shut Down a Job Manager Cluster ... 2-9 UNIX and Macintosh Operating Systems ...

Page 106 - 4 Admin Center

3 Product Installation It might take a moment for Admin Center to communicate with all the nodes, start the services, and acquire the status of all of

Page 107 - Start an MJS

Configure for an MJS If any of the connectivity tests fail, double-click the icon that indicates a failure to get information about that specific te

Page 108

3 Product Installation a To start an MJS (job manager), click Start in the MJS module. (This is one of several ways to open the New MJS dialog b

Page 109 - Start Workers

Configure for an MJS e Click OK to start the workers and return to the Admin Center dialog box. It might take a moment for Admin Center to initialize al

Page 110

3 Product Installation If you encounter any problems or failures, contact the MathWorks install support team. For more information about Admin Center fu

Page 111

Configure for an MJS Command Window, and select Run as Administrator. This option is available only if you are running User Account Control (UAC). ii If you

Page 112

3 Product Installation 2 Start the MJS To start the MATLAB job scheduler (MJS), enter the following commands in a DOS command window. You do not have t

Page 113 - Test Connectivity

Configure for an MJS cd matlabroot\toolbox\distcomp\bin b Start the workers on each node, using the text for <MyMJS> that identifies the name of th

Page 114

3 Product Installation indicate protocol, platform (such as in a mixed environment), or other information, see the help for remotemdce by typing ./remo

Page 115

Configure for an MJS cd matlabroot/toolbox/distcomp/bin b Start the workers on each node, using the text for <MyMJS> that identifies the name of th

Page 116 - .mdcs to the filename

Configure Cluster to Use a MATLAB Job Scheduler (MJS) ... 3-5 Configure Windows Firewalls on Client ...

Page 117 - Prepare for Cluster Profiles

3 Product Installation Debian, Fedora Platforms. On each cluster node, register the mdce service as a known service and configure it to start automatic

Page 118

Configure for an MJS 4 Look in /etc/inittab for the default run level. Create a link in the rc folder associated with that run level. For example, if

Page 119 - Alphabetical List

3 Product Installation sudo ln -s matlabroot/toolbox/distcomp/bin/mdce /usr/sbin/mdce 3 Copy the launchd .plist file for mdce to /Library/LaunchDae

Page 120 - Syntax admincenter

Configure for an MJS 1 On the client computer where Parallel Computing Toolbox is installed, open a DOS command window (for Windows software) or a shell (for UNIX so

Page 121 - See Also mdce

3 Product Installation 5 Click Done to save your cluster profile. Step 3: Validate the Cluster Profile In this step you validate your cluster profile,

Page 122 - Syntax mdce install

Configure for an MJS Note If your validation does not pass, contact the MathWorks install support team. If your validation passed, you now have a valid

Page 123

3 Product Installation Configure for HPC Server In this section... “Configure Cluster for Microsoft Windows HPC Server” on page 3-28 “Configure Client Co

Page 124 - Syntax nodestatus

Configure for HPC Server Note If you need to override the script default values, modify the values defined in MicrosoftHPCServerSetup.xml before running

Page 125

3 Product Installation Note If you need to override the default values the script, modify the values defined in MicrosoftHPCServerSetup.xml before running Microsof

Page 126

Configure for HPC Server b Set the NumWorkers field to the number of workers you want to run the validation tests on, within the limitation of you

Page 127

Test Connectivity ... 4-11 Export and Import Sessions ... 4-14 Prepare for Cluster Profiles ...

Page 128 - See Also remotemdce

3 Product Installation 5 Click Done to save your cluster profile. Step 2: Validate the Configuration In this step you validate your cluster profile, a

Page 129

Configure for HPC Server Note If your validation does not pass, contact the MathWorks install support team. If your validation passed, you now have a

Page 130

3 Product Installation Configure for PBS Pro, Platform LSF, TORQUE In this section... “Configure Platform LSF Scheduler on Windows Cluster” on page 3-

Page 131 - See Also mdce

Configure for PBS Pro, Platform LSF, TORQUE To use mpiexec to distribute a job, the smpd service must be running on all nodes that will be used for run

Page 132 - Syntax startjobmanager

3 Product Installation 4 If you are using Windows firewalls on your cluster nodes, execute the following in a DOS command window. matlabroot\toolbox\dis

Page 133

Configure for PBS Pro, Platform LSF, TORQUE shared installation), execute the following command in a DOS command window. matlabroot\bin\matlab.bat -in

Page 134 - Syntax startworker

3 Product Installation 1 Start the Cluster Profile Manager from the MATLAB desktop by selecting on the Home tab in the Environment area Parallel >

Page 135 - -remotehost WorkerHost

Configure for PBS Pro, Platform LSF, TORQUE 5 Click Done to save your cluster profile. Step 2: Validate the Cluster Profile In this step you verify you

Page 136

3 Product Installation Note If your validation does not pass, contact the MathWorks install support team. If your validation passed, you now have a va

Page 137 - Syntax stopjobmanager

Configure for a Generic Scheduler Configure for a Generic Scheduler In this section... “Interfacing with Generic Schedulers” on page 3-42 “Configure Gene

Page 138

1 Introduction • “MATLAB® Distributed Computing Server™ Product Description” on page 1-2 • “Product Overview” on page 1-3 • “Toolbox and Server Components”

Page 139 - Syntax stopworker

3 Product Installation Interfacing with Generic Schedulers • “Support Scripts” on page 3-42 • “Submission Mode” on page 3-42 Support Scripts To support us

Page 140

Configure for a Generic Scheduler Before using the support scripts, decide which submission mode describes your particular network setup. Configure Gener

Page 141 - Glossary

3 Product Installation 2 Start smpd by typing in a DOS command window one of the following, as appropriate: matlabroot\bin\win32\smpd -install or matlabro

Page 142 - Glossary-2

Configure for a Generic Scheduler 8 Repeat all these steps on all Windows nodes in your cluster. Using Passwordless Delegation 1 Log in as a user with a

Page 143 - Glossary-3

3 Product Installation Configure Sun Grid Engine on Linux Cluster To run communicating jobs with MATLAB Distributed Computing Server and Sun™ Grid Engin

Page 144 - Glossary-4

Configure for a Generic Scheduler qconf -mq all.q This will bring up a text editor for you to make changes: search for the line pe_list, and add matlab. 5 En

Page 145 - Glossary-5

3 Product Installation Note The remainder of this chapter illustrates only the case of using LSF in a nonshared file system. For other schedulers or a

Page 146 - Glossary-6

Configure for a Generic Scheduler In this type of configuration, job data is copied from the client host running a Windows operating system to a host on

Page 147

3 Product Installation 2 Start the Cluster Profile Manager from the MATLAB desktop by selecting Parallel > Manage Cluster Profiles. 3 Create a new p

Page 148

Configure for a Generic Scheduler g Set the OperatingSystem to the operating system of your cluster worker machines. h Set HasSharedFilesystem to false,
