NRAO Home  >  Green Bank  |  Wiki Topic:    GB > Operate > SpectrometerRecoveryProcedures (r1.1 vs. r1.20)
 <<O>>  Difference Topic SpectrometerRecoveryProcedures (r1.20 - 06 Feb 2008 - DavidRose)
Changed:
<
<

-- DavidRose - 22 Mar 2004

>
>

-- DavidRose - 6 Feb 2008


 <<O>>  Difference Topic SpectrometerRecoveryProcedures (r1.19 - 29 Nov 2004 - RichardLacasse)
Changed:
<
<

Continuous serial line errors (more than 1 per second) | cycle power
>
>

Continuous serial line errors (more than 1 per second) | reset; if that does not work, cycle power
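
The serial line rule can be sketched as a small shell function. This is only an illustration, not part of the Spectrometer software: the function name and its two arguments (an error count and the number of seconds over which it was observed) are hypothetical, and treating any non-zero rate at or below 1 per second as "intermittent" is an assumption, since the page only says "every few seconds".

```shell
# Hypothetical helper: map a serial line error count to the action in the table.
# $1 = number of serial line errors seen, $2 = observation window in seconds.
serial_line_action() {
    errors=$1
    seconds=$2
    if [ "$errors" -gt "$seconds" ]; then
        # More than 1 error per second: continuous errors.
        echo "reset; if that does not work, cycle power"
    elif [ "$errors" -gt 0 ]; then
        # Assumption: any lower non-zero rate counts as intermittent.
        echo "Note in OpsLog and ignore"
    else
        echo "no action"
    fi
}
```

For example, `serial_line_action 30 10` reports the continuous-error action, while `serial_line_action 3 10` reports the intermittent one.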

 <<O>>  Difference Topic SpectrometerRecoveryProcedures (r1.18 - 27 Aug 2004 - DavidRose)

 <<O>>  Difference Topic SpectrometerRecoveryProcedures (r1.17 - 22 Mar 2004 - DavidRose)
Changed:
<
<

hard reboot
At the workstation earth, turn the power off using the small black switch on the front panel inside the door. Inside the spectrometer cabinet press the Reset button on the MVME167-32A. Following the reboots, ensure the program spectrometer_init has completed (see the ps command noted above) and then do a systemstart. The observer will need to reconfigure the Spectrometer.
>
>

hard reboot
At the workstation earth, turn the power off using the small black switch on the front panel inside the door. Inside the spectrometer cabinet press the Reset button on the MVME167-32A. Following the reboots, ensure the program spectrometer_init has completed (see the ps command noted above) and then do a systemstart. The observer will need to reconfigure the Spectrometer.
Changed:
<
<

cycle power
In the spectrometer cabinet, turn System Power to OFF, turn System Monitor to OFF, wait 5 seconds, turn System Monitor to ON, and turn System Power to ON. This will take a couple of minutes to complete; one can verify it has started by the "initializing hardware" message. This action does not require reconfiguration of the software.
>
>

cycle power
In the spectrometer cabinet, turn System Power to OFF, turn System Monitor to OFF, wait 5 seconds, turn System Monitor to ON, and turn System Power to ON. This will take a couple of minutes to complete; one can verify it has started by the "initializing hardware" message. This action does not require reconfiguration of the software.
Changed:
<
<

-- DavidRose - 19 Mar 2004

>
>

-- DavidRose - 22 Mar 2004


 <<O>>  Difference Topic SpectrometerRecoveryProcedures (r1.16 - 19 Mar 2004 - DavidRose)
Changed:
<
<

hard reboot
At the workstation earth, turn the power off using the small black switch on the front panel inside the door. Inside the spectrometer cabinet press the Reset button on the MVME167. Following the reboots insure the program spectrometer_init is completed (see ps command noted above) and then do a systemstart. The observer will need to reconfigure the Spectrometer.
>
>

hard reboot
At the workstation earth, turn the power off using the small black switch on the front panel inside the door. Inside the spectrometer cabinet press the Reset button on the MVME167-32A. Following the reboots insure the program spectrometer_init is completed (see ps command noted above) and then do a systemstart. The observer will need to reconfigure the Spectrometer.
Changed:
<
<

cycle power
In the spectrometer cabinet, turn System Power to OFF, turn System Monitor to OFF, wait 5 seconds, turn System Monitor to ON, and turn System Power to ON. This will take a couple of minutes to complete; one can verify it has started by the "initializing hardware" message. This action does not require reconfiguration of the software.
>
>

cycle power
In the spectrometer cabinet, turn System Power to OFF, turn System Monitor to OFF, wait 5 seconds, turn System Monitor to ON, and turn System Power to ON. This will take a couple of minutes to complete; one can verify it has started by the "initializing hardware" message. This action does not require reconfiguration of the software.
Changed:
<
<

-- DavidRose - 08 Mar 2004

>
>

-- DavidRose - 19 Mar 2004


 <<O>>  Difference Topic SpectrometerRecoveryProcedures (r1.15 - 08 Mar 2004 - DavidRose)
Changed:
<
<

Info 

The character ^ is used below to emphasize spaces between dialog elements.

>
>

Info 

The character ^ is used below to emphasize spaces between dialog elements.

The Cleo TaskMaster application may also be used to complete the systemstop actions shown below.

Problems and Actions 

Problem | Action
A single data interrupt failure | Rerun the scan (even if the system has recovered, the message will not clear until another scan has been run)
A second data interrupt failure | reset
Multiple data interrupt failures | reboot
A DMA failure | Rerun the scan
Bad data, e.g. odd looking lags, steps or jumps in the ACF (link to example plot planned) | reset
Bad data continues after reset | self test
Spectrometer locks up, i.e., remains hung in a state despite repeated aborts or is not responsive | restart
System power and/or system monitor lights are found off | cycle power
Cannot log into earth either from the console or remotely | If possible call someone in computer division or software division, else as a last resort do a hard reboot
Spectrometer program (TaskMaster, Spectrometer or Transporter) will not die even using kill -9 | reboot
The command ps shows that a spectrometer program (TaskMaster, Spectrometer or Transporter) is not running | restart
Continued failures or other problems | Contact software division callout person(s)
Bad data keeps being generated in spite of repeated resets. | self test
Self test fails | Contact digital division callout person(s).
Restart or reboot fails | Contact software division callout person(s)
Continuous serial line errors (more than 1 per second) | cycle power
Intermittent serial line errors (every few seconds) | Note in OpsLog and ignore

New Spectrometer problems should be entered into this list by operators to mark the need for new actions. For diagnostic purposes, it's important that these actions be followed and fully recorded in OpsLog.
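
The escalation for data interrupt failures in the table above can be sketched as a lookup. This is a minimal illustration only: the function name is hypothetical, and reading "multiple" as three or more failures is an assumption rather than something the page states.

```shell
# Hypothetical helper: map the number of data interrupt failures seen so far
# to the action named in the Problems and Actions table.
data_interrupt_action() {
    case "$1" in
        0) echo "no action" ;;
        1) echo "rerun the scan" ;;
        2) echo "reset" ;;
        # Assumption: "multiple" means three or more; anything else lands here too.
        *) echo "reboot" ;;
    esac
}
```

Whatever action is taken should still be recorded fully in OpsLog, as the note above requires.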

Changed:
<
<

Problems and Actions 

Problem | Action
A single data interrupt failure | Rerun the scan (even if the system has recovered, the message will not clear until another scan has been run)
A second data interrupt failure | reset
Multiple data interrupt failures | reboot
A DMA failure | Rerun the scan
Bad data, e.g. odd looking lags, steps or jumps in the ACF (link to example plot planned) | reset
Bad data continues after reset | self test
Spectrometer locks up, i.e., remains hung in a state despite repeated aborts or is not responsive | restart
System power and/or system monitor lights are found off | cycle power
Cannot log into earth either from the console or remotely | If possible call someone in computer division or software division, else as a last resort do a hard reboot
Spectrometer program (TaskMaster, Spectrometer or Transporter) will not die even using kill -9 | reboot
The command ps shows that a spectrometer program (TaskMaster, Spectrometer or Transporter) is not running | restart
Continued failures or other problems | Contact software division callout person(s)
Bad data keeps being generated in spite of repeated resets. | self test
Self test fails | Contact digital division callout person(s).
Restart or reboot fails | Contact software division callout person(s)
Continuous serial line errors (more than 1 per second) | cycle power
Intermittent serial line errors (every few seconds) | Note in OpsLog and ignore

New Spectrometer problems should be entered into this list by operators to mark the need for new actions.

For purposes of diagnostics, it is important that these actions be followed and fully recorded in the OpsLog.

-- MarkClark - 05 Mar 2004

>
>

-- DavidRose - 08 Mar 2004


 <<O>>  Difference Topic SpectrometerRecoveryProcedures (r1.14 - 05 Mar 2004 - MarkClark)
Deleted:
<
<

Abort causes the Spectrometer to lock up | Bypass ScanCoordinator/GO and use Stop in Cleo's Spectrometer window in lieu of Abort to end a scan; be patient, the Spectrometer will complete the current integration

 <<O>>  Difference Topic SpectrometerRecoveryProcedures (r1.13 - 05 Mar 2004 - MarkClark)
Added:
>
>

Abort causes the Spectrometer to lock up | Bypass ScanCoordinator/GO and use Stop in Cleo's Spectrometer window in lieu of Abort to end a scan; be patient, the Spectrometer will complete the current integration
Changed:
<
<

-- MarkClark - 04 Mar 2004

>
>

-- MarkClark - 05 Mar 2004


 <<O>>  Difference Topic SpectrometerRecoveryProcedures (r1.12 - 04 Mar 2004 - MarkClark)
Added:
>
>

Bad data continues after reset | self test
Added:
>
>

Continuous serial line errors (more than 1 per second) | cycle power
Intermittent serial line errors (every few seconds) | Note in OpsLog and ignore
Changed:
<
<

For purposes of diagnostics, it is important that these actions be followed and recorded in the OpsLog.

>
>

For purposes of diagnostics, it is important that these actions be followed and fully recorded in the OpsLog.

Changed:
<
<

-- MarkClark - 02 Mar 2004

>
>

-- MarkClark - 04 Mar 2004


 <<O>>  Difference Topic SpectrometerRecoveryProcedures (r1.11 - 02 Mar 2004 - MarkClark)
Changed:
<
<

self test
From the CLEO Spectrometer window select Change Configuration..., then select Testing, then Next and then Finish. In the Testing tab select Test Using Interrupts. Be patient. At the conclusion of the test a window will pop up filled with tables of numbers. If the test failed then the table will contain non-zero values and a failure message will be generated.
>
>

self test
From the CLEO Spectrometer window select Change Configuration..., then select Testing, then Next and then Finish. In the Testing tab select Test Using Interrupts. Be patient. At the conclusion of the test a window will pop up filled with tables of numbers. If the test failed then the table will contain non-zero values and a failure message will be generated. The observer will need to reconfigure the Spectrometer.
Changed:
<
<

A single data interrupt failure | rerun the scan (even if the system has recovered, the message will not clear until another scan has been run)
>
>

A single data interrupt failure | Rerun the scan (even if the system has recovered, the message will not clear until another scan has been run)
Changed:
<
<

Odd looking lags, steps or jumps in the ACF (link to example plot planned) | reset
>
>

A DMA failure | Rerun the scan
Bad data, e.g. odd looking lags, steps or jumps in the ACF (link to example plot planned) | reset

 <<O>>  Difference Topic SpectrometerRecoveryProcedures (r1.10 - 02 Mar 2004 - MarkClark)
Changed:
<
<

This procedure provides guidance for correcting problems associated with the Spectrometer. For software related problems not described below, contact the Software Division. For hardware related problems, contact the Digital Group. See current callout lists for individuals responsible. Unless the observing situation/observing friend suggests otherwise, the callout person should be contacted. Refer to the current callout policy as needed.

>
>

This procedure provides guidance for correcting problems associated with the Spectrometer. This page should be referred to directly, i.e., not any copy of it, because it is expected the instructions will be changing on a regular basis. For software related problems not described below, contact the Software Division. For hardware related problems, contact the Digital Group. See current callout lists for individuals responsible. Unless the observing situation/observing friend suggests otherwise, the callout person should be contacted. Refer to the current callout policy as needed.

Added:
>
>

self test
From the CLEO Spectrometer window select Change Configuration..., then select Testing, then Next and then Finish. In the Testing tab select Test Using Interrupts. Be patient. At the conclusion of the test a window will pop up filled with tables of numbers. If the test failed then the table will contain non-zero values and a failure message will be generated.
Changed:
<
<

hard reboot
At the workstation earth, turn the power off using the small black switch on the front panel inside the door. Inside the spectrometer cabinet press the Reset button on the MVME167. Following the reboots insure the program spectrometer_init is completed (see PS command noted above) and then do a systemstart. The observer will need to reconfigure the Spectrometer.
>
>

hard reboot
At the workstation earth, turn the power off using the small black switch on the front panel inside the door. Inside the spectrometer cabinet press the Reset button on the MVME167. Following the reboots insure the program spectrometer_init is completed (see ps command noted above) and then do a systemstart. The observer will need to reconfigure the Spectrometer.
Changed:
<
<

cycle power
In the spectrometer cabinet, turn System Power to OFF, turn System Monitor to OFF, wait 5 seconds, turn System Power to ON, and turn System Monitor to ON. This will take a couple of minutes to complete; one can verify it has started by the "initializing hardware" message. This action does not require reconfiguration of the software.
>
>

cycle power
In the spectrometer cabinet, turn System Power to OFF, turn System Monitor to OFF, wait 5 seconds, turn System Monitor to ON, and turn System Power to ON. This will take a couple of minutes to complete; one can verify it has started by the "initializing hardware" message. This action does not require reconfiguration of the software.
Changed:
<
<

Cannot log into earth either from the console or remotely | if possible call someone in computer division or software division, else do a hard reboot
Spectrometer program (TaskMaster, Spectrometer or Transporter) will not die | reboot
Spectrometer program (TaskMaster, Spectrometer or Transporter) is not running | restart
Continued failures or other problems | contact software division callout person(s)
>
>

Cannot log into earth either from the console or remotely | If possible call someone in computer division or software division, else as a last resort do a hard reboot
Spectrometer program (TaskMaster, Spectrometer or Transporter) will not die even using kill -9 | reboot
The command ps shows that a spectrometer program (TaskMaster, Spectrometer or Transporter) is not running | restart
Continued failures or other problems | Contact software division callout person(s)
Bad data keeps being generated in spite of repeated resets. | self test
Self test fails | Contact digital division callout person(s).
Restart or reboot fails | Contact software division callout person(s)
Changed:
<
<

-- DavidRose - 01 Mar 2004

>
>

For purposes of diagnostics, it is important that these actions be followed and recorded in the OpsLog.

-- MarkClark - 02 Mar 2004


 <<O>>  Difference Topic SpectrometerRecoveryProcedures (r1.9 - 01 Mar 2004 - DavidRose)
Changed:
<
<

The following procedures are used to resurrect the Spectrometer should it fail to function correctly, become hung in a particular state, or errors are noted (serial line, DMA, or load xilinx files errors for example). Use the procedures in the order prescribed unless directed otherwise.

>
>

This procedure provides guidance for correcting problems associated with the Spectrometer. For software related problems not described below, contact the Software Division. For hardware related problems, contact the Digital Group. See current callout lists for individuals responsible. Unless the observing situation/observing friend suggests otherwise, the callout person should be contacted. Refer to the current callout policy as needed.

Info 

The character ^ is used below to emphasize spaces between dialog elements.

Changed:
<
<

should also be used in the operator's log and call-outs are required.

cycle power
In the spectrometer cabinet, turn System Power to OFF, turn System Monitor to OFF, wait 5 seconds, turn System Power to ON, and turn System Monitor to ON. This will take a couple of minutes to complete; one can verify it has started by the "initializing hardware" message. This action does not require reconfiguration of the software.
>
>

should also be used when describing problem/failure entries in OpsLog and when consulting callout personnel.

Changed:
<
<

hard reboot
At the workstation earth, turn the power off using the small black switch on the front panel inside the door. Inside the spectrometer cabinet press the Reset button on the MVME167. Following the reboots insure the program spectrometer_init is completed and then do a systemstart. The observer will need to reconfigure the Spectrometer.
>
>

reset
From the CLEO Spectrometer Manager menu, select Reset parameters. This will take a couple of minutes to complete; one can verify it has started by the "initializing hardware" message. This action does not require reconfiguration of the software.
Changed:
<
<

reboot
Do a systemstop and then log into earth as root (either remotely or from earth's console). Note that the command listed below will end the login when it reboots the computer. After earth has finished rebooting, do a systemstart. The observer will need to reconfigure the Spectrometer.
>
>

systemstop
Log into earth as monctrl and enter the commands listed below. The last command prints all programs being run by monctrl. One should not see TaskMaster, Spectrometer or Transporter running.
Changed:
<
<

$ sync; init 6

>
>

$ source ^ /home/gbt/gbt.bash
$ TaskMaster ^ earth ^ systemstop
$ ps ^ -u ^ monctrl

Deleted:
<
<

reset
From the CLEO's Spectrometer Managers menu select Reset parameters. This will take a couple of minutes to complete; one can verify it has started by the "initializing hardware" message. This action does not require reconfiguration of the software.

restart
Do a systemstop followed by a systemstart.
Changed:
<
<

$ ps -u monctrl
$ source /home/gbt/gbt.bash
$ TaskMaster earth systemstart /home/gbt/etc/config/earthProc.conf
$ ps -u monctrl

>
>

$ ps ^ -u ^ monctrl
$ source ^ /home/gbt/gbt.bash
$ TaskMaster ^ earth ^ systemstart ^ /home/gbt/etc/config/earthProc.conf
$ ps ^ -u ^ monctrl

Changed:
<
<

systemstop
Log into earth as monctrl and enter the commands listed below. The last command prints all programs being run by monctrl. One should not see TaskMaster, Spectrometer or Transporter running.
>
>

restart
Do a systemstop followed by a systemstart.

reboot
Do a systemstop and then log into earth as root (either remotely or from earth's console). Note that the command listed below will end the login when it reboots the computer. After earth has finished rebooting, do a systemstart. The observer will need to reconfigure the Spectrometer.
Changed:
<
<

$ source /home/gbt/gbt.bash
$ TaskMaster earth systemstop
$ ps -u monctrl

>
>

$ sync; ^ init ^ 6

Added:
>
>

hard reboot
At the workstation earth, turn the power off using the small black switch on the front panel inside the door. Inside the spectrometer cabinet press the Reset button on the MVME167. Following the reboots insure the program spectrometer_init is completed (see PS command noted above) and then do a systemstart. The observer will need to reconfigure the Spectrometer.
Changed:
<
<

Problems and Actions 

>
>

cycle power
In the spectrometer cabinet, turn System Power to OFF, turn System Monitor to OFF, wait 5 seconds, turn System Power to ON, and turn System Monitor to ON. This will take a couple of minutes to complete; one can verify it has started by the "initializing hardware" message. This action does not require reconfiguration of the software.
Added:
>
>

Problems and Actions 

Changed:
<
<

single data interrupt failure | rerun the scan (even if the system has recovered, the message will not clear until another scan has been run)
second data interrupt failure | reset
multiple data interrupt failures | reboot
odd looking lags, steps or jumps in the ACF (need example plots here) | reset
spectrometer locks up, i.e., remains hung in a state despite repeated aborts or is not responsive | restart
system power and/or system monitor lights are found off | cycle power
cannot log into earth either from the console or remotely | if possible call someone in computer division or software development, else do a hard reboot
spectrometer program (TaskMaster?, Spectrometer or Transporter) will not die | reboot
spectrometer program (TaskMaster?, Spectrometer or Transporter) is not running | restart
continued failures or other problems | call someone in software development
>
>

A single data interrupt failure | rerun the scan (even if the system has recovered, the message will not clear until another scan has been run)
A second data interrupt failure | reset
Multiple data interrupt failures | reboot
Odd looking lags, steps or jumps in the ACF (link to example plot planned) | reset
Spectrometer locks up, i.e., remains hung in a state despite repeated aborts or is not responsive | restart
System power and/or system monitor lights are found off | cycle power
Cannot log into earth either from the console or remotely | if possible call someone in computer division or software division, else do a hard reboot
Spectrometer program (TaskMaster, Spectrometer or Transporter) will not die | reboot
Spectrometer program (TaskMaster, Spectrometer or Transporter) is not running | restart
Continued failures or other problems | contact software division callout person(s)
Changed:
<
<

Info 

For software related problems contact the Software Division. For hardware related problems, contact the Digital Group. See current callout lists for individuals responsible for the Spectrometer. The callout person should be contacted unless the observing situation/observing friend dictates otherwise. Refer to the current callout policy as needed.

Recovery Procedure 1: Use to correct for odd looking lags, steps, or jumps in the ACF.

Recovery Procedure 2: Use when the system locks up (hung in a state, not responsive), if procedure 1 fails to help, or as directed by the callout person.

Recovery Procedure 3: Use if the Spectrometer's System Power and/or System Monitor lights are found off, if procedure 2 fails to help, or as directed by the callout person.

The character " ^ " is used below to emphasize spaces between dialog elements.

Recovery Procedure 1

STEP ACTION
1 Use the Reset Parameters option of the Spectrometer menu to reset the hardware.
2 Proceed to recovery procedure 2 and/or call for assistance if the problem persists

Recovery Procedure 2

STEP ACTION
1 Abort the observation via the Scan Coordinator. If the system goes fatal and remains hung, continue to next step.
2 Check the Spectrometer cabinet System Monitor and System Power lights. If both are ON, continue to next step. If either is OFF, proceed to Recovery Procedure 3.
3 Go to the workstation earth and log in as root.
4 At the root prompt, enter: sync; ^ init ^ 6 (reboots earth).
5 Once earth has rebooted, go to victor (or whatever workstation you're operating from) and ensure correct sourcing. Enter: source ^ /home/gbt/gbt.bash
6 On this same workstation, log in as monctrl and run the TaskMaster command to restart earth's processes: TaskMaster ^ earth ^ systemstart ^ /home/gbt/etc/config/earthProc.conf
7 Monitor the message window and the Spectrometer application to ensure complete recovery.
8 Proceed to recovery procedure 3 or call for additional assistance if problems continue.

Recovery Procedure 3

STEP ACTION
1 Go to the workstation earth and turn the workstation power OFF (small black switch on front panel inside door)
2 Go to the Spectrometer cabinet, VME rack, and press the RESET button on MVME167.
3 In the Spectrometer cabinet: Toggle OFF System Power, Toggle OFF System Monitor, Wait 5 seconds, Toggle ON System Monitor, Toggle ON System power
4 Go to the workstation earth and turn the workstation power ON. The system will now reboot.
5 Once earth has rebooted, go to victor (or whatever workstation you're operating from) and ensure correct sourcing. Enter: source ^ /home/gbt/gbt.bash
6 On this same workstation, log in as monctrl and run the TaskMaster command to restart processes: TaskMaster ^ earth ^ systemstart ^ /home/gbt/etc/config/earthProc.conf
7 Monitor the message window and the Spectrometer application to ensure complete recovery.
8 Call for assistance if problems continue.

-- MarkClark - 01 Mar 2004

>
>

-- DavidRose - 01 Mar 2004


 <<O>>  Difference Topic SpectrometerRecoveryProcedures (r1.8 - 01 Mar 2004 - MarkClark)
Changed:
<
<

Glossary 

>
>

Action Definitions 

These terms are used to describe what procedures should be followed given specific spectrometer problems. They should also be used in the operator's log and call-outs are required.

cycle power
In the spectrometer cabinet, turn System Power to OFF, turn System Monitor to OFF, wait 5 seconds, turn System Power to ON, and turn System Monitor to ON. This will take a couple of minutes to complete; one can verify it has started by the "initializing hardware" message. This action does not require reconfiguration of the software.
Changed:
<
<

reboot
Do a systemstop and then log into earth as root (either remotely or from earth's console). Note that the command listed below will end the login when it reboots the computer. Following the command insure the program spectrometer_init is completed and then do a systemstart. The observer will need to reconfigure the Spectrometer.
>
>

reboot
Do a systemstop and then log into earth as root (either remotely or from earth's console). Note that the command listed below will end the login when it reboots the computer. After earth has finished rebooting, do a systemstart. The observer will need to reconfigure the Spectrometer.
Deleted:
<
<

recycle power
In the spectrometer cabinet, turn System Power to OFF, turn System Monitor to OFF, wait 5 seconds, turn System Power to ON, and turn System Monitor to ON. This will take a couple of minutes to complete; one can verify it has started by the "initializing hardware" message. This action does not require reconfiguration of the software.
Changed:
<
<

systemstart
Log into earth as monctrl and enter the commands listed below. The ps command prints all programs being run by monctrl. After the first ps command, one should not see TaskMaster, Spectrometer or Transporter running. After the last ps command, one should see all three programs running. The observer will need to reconfigure the Spectrometer.
>
>

systemstart
Log into earth as monctrl and enter the commands listed below. The ps command prints all programs being run by monctrl. After the first ps command, one should not see TaskMaster, Spectrometer, spectrometer_init, or Transporter running. If spectrometer_init is running, wait a couple of minutes and run ps again. After the last ps command, one should see TaskMaster, Spectrometer and Transporter running. The observer will need to reconfigure the Spectrometer.
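
The ps checks described for systemstart can be sketched as a filter over the ps listing. This is a hedged illustration, not a tool that exists on earth: the function name, its messages and its return codes are hypothetical. It matches the prefix `spectrometer_in` on the assumption that ps on Linux may truncate command names to 15 characters.

```shell
# Hypothetical helper: reads the output of `ps -u monctrl` on stdin and reports
# whether TaskMaster, Spectrometer and Transporter are all present.
check_spectrometer_procs() {
    listing=$(cat)
    # Assumption: ps may show spectrometer_init truncated, so match the prefix.
    if printf '%s\n' "$listing" | grep -q 'spectrometer_in'; then
        echo "spectrometer_init still running: wait and run ps again"
        return 2
    fi
    missing=""
    for name in TaskMaster Spectrometer Transporter; do
        # -w matches whole words only; the match is case-sensitive.
        if ! printf '%s\n' "$listing" | grep -qw "$name"; then
            missing="$missing $name"
        fi
    done
    if [ -n "$missing" ]; then
        echo "not running:$missing"
        return 1
    fi
    echo "all running"
}
```

A possible use after the last ps command would be `ps -u monctrl | check_spectrometer_procs`. Before a systemstart, the expected report is that all three programs are not running.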
Added:
>
>

Problems and Actions 

Problem | Action
single data interrupt failure | rerun the scan (even if the system has recovered, the message will not clear until another scan has been run)
second data interrupt failure | reset
multiple data interrupt failures | reboot
odd looking lags, steps or jumps in the ACF (need example plots here) | reset
spectrometer locks up, i.e., remains hung in a state despite repeated aborts or is not responsive | restart
system power and/or system monitor lights are found off | cycle power
cannot log into earth either from the console or remotely | if possible call someone in computer division or software development, else do a hard reboot
spectrometer program (TaskMaster?, Spectrometer or Transporter) will not die | reboot
spectrometer program (TaskMaster?, Spectrometer or Transporter) is not running | restart
continued failures or other problems | call someone in software development

New Spectrometer problems should be entered into this list by operators to mark the need for new actions.

Changed:
<
<

-- DavidRose - 5 Feb 2004

>
>

-- MarkClark - 01 Mar 2004


 <<O>>  Difference Topic SpectrometerRecoveryProcedures (r1.7 - 01 Mar 2004 - MarkClark)
Added:
>
>

Glossary 

hard reboot
At the workstation earth, turn the power off using the small black switch on the front panel inside the door. Inside the spectrometer cabinet press the Reset button on the MVME167. Following the reboots insure the program spectrometer_init is completed and then do a systemstart. The observer will need to reconfigure the Spectrometer.

reboot
Do a systemstop and then log into earth as root (either remotely or from earth's console). Note that the command listed below will end the login when it reboots the computer. Following the command insure the program spectrometer_init is completed and then do a systemstart. The observer will need to reconfigure the Spectrometer.
            $ sync; init 6
recycle power
In the spectrometer cabinet, turn System Power to OFF, turn System Monitor to OFF, wait 5 seconds, turn System Power to ON, and turn System Monitor to ON. This will take a couple of minutes to complete; one can verify it has started by the "initializing hardware" message. This action does not require reconfiguration of the software.

reset
From the CLEO's Spectrometer Managers menu select Reset parameters. This will take a couple of minutes to complete; one can verify it has started by the "initializing hardware" message. This action does not require reconfiguration of the software.

restart
Do a systemstop followed by a systemstart.

systemstart
Log into earth as monctrl and enter the commands listed below. The ps command prints all programs being run by monctrl. After the first ps command, one should not see TaskMaster, Spectrometer or Transporter running. After the last ps command, one should see all three programs running. The observer will need to reconfigure the Spectrometer.
            $ ps -u monctrl
            $ source /home/gbt/gbt.bash
            $ TaskMaster earth systemstart /home/gbt/etc/config/earthProc.conf
            $ ps -u monctrl
systemstop
Log into earth as monctrl and enter the commands listed below. The last command prints all programs being run by monctrl. One should not see TaskMaster, Spectrometer or Transporter running.
            $ source /home/gbt/gbt.bash
            $ TaskMaster earth systemstop
            $ ps -u monctrl
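
The restart action defined above (a systemstop followed by a systemstart) can be sketched as one sequence. The commands themselves are the ones listed in this glossary; the `run` wrapper, the `DRY_RUN` variable and the function name are hypothetical additions so that the sequence can be printed and reviewed before anything is executed. Running it for real assumes a bash login as monctrl on earth.

```shell
# Hypothetical dry-run wrapper: prints each command unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Sketch of "restart": systemstop, confirm with ps, systemstart, confirm with ps.
restart_spectrometer() {
    run source /home/gbt/gbt.bash
    run TaskMaster earth systemstop
    run ps -u monctrl    # TaskMaster, Spectrometer, Transporter should be gone
    run TaskMaster earth systemstart /home/gbt/etc/config/earthProc.conf
    run ps -u monctrl    # all three programs should be listed again
}
```

With the default dry-run setting, calling `restart_spectrometer` only prints the five commands in order.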

 <<O>>  Difference Topic SpectrometerRecoveryProcedures (r1.6 - 29 Feb 2004 - DavidRose)

 <<O>>  Difference Topic SpectrometerRecoveryProcedures (r1.5 - 19 Feb 2004 - JohnFord)
Deleted:
<
<

Alternate Recovery Procedure: This is the method suggested by J. Brandt and seems to work real well.

Deleted:
<
<

Alternate Recovery Procedure

STEP ACTION
1 From any workstation, log in as monctrl: su ^ monctrl (password).
2 Change to the i386-linux directory: cd ^ /home/gbt/exec/i386-linux
4 Copy in the following file: cp ^ spectrometer.with_lock spectrometer (reboots the Spectrometer)
5 As monctrl, use TaskMaster to stop Spectrometer processes on earth: TaskMaster ^ earth ^ stop ^ 2
6 As monctrl, use TaskMaster to start Spectrometer processes on earth: TaskMaster ^ earth ^ start ^ 2
7 Monitor the message window and the Spectrometer application to ensure complete recovery.
8 Call for assistance if problems continue.

 <<O>>  Difference Topic SpectrometerRecoveryProcedures (r1.4 - 05 Feb 2004 - DavidRose)
Added:
>
>

Alternate Recovery Procedure: This is the method suggested by J. Brandt and seems to work real well.

Added:
>
>

Alternate Recovery Procedure

STEP ACTION
1 From any workstation, log in as monctrl: su ^ monctrl (password).
2 Change to the i386-linux directory: cd ^ /home/gbt/exec/i386-linux
4 Copy in the following file: cp ^ spectrometer.with_lock spectrometer (reboots the Spectrometer)
5 As monctrl, use TaskMaster to stop Spectrometer processes on earth: TaskMaster ^ earth ^ stop ^ 2
6 As monctrl, use TaskMaster to start Spectrometer processes on earth: TaskMaster ^ earth ^ start ^ 2
7 Monitor the message window and the Spectrometer application to ensure complete recovery.
8 Call for assistance if problems continue.
Changed:
<
<

-- DavidRose - 21 Jan 2004

>
>

-- DavidRose - 5 Feb 2004


 <<O>>  Difference Topic SpectrometerRecoveryProcedures (r1.3 - 21 Jan 2004 - DavidRose)
Changed:
<
<

>
>

Changed:
<
<

(serial line, DMA, or load xilinx files errors for example). Use the procedures in the order prescribed unless directed otherwise by the Digital Group.  

>
>

(serial line, DMA, or load xilinx files errors for example). Use the procedures in the order prescribed unless directed otherwise.

Changed:
<
<

For Software related problems contact the Software Division (Mark Clark). For hardware related problems, contact the Digital Group (Holly Chen, Rich Lacasse, or John Ford). 

Aborting via the Scan Coordinator should be attempted first as this will occasionally correct a hung state problem.

>
>

For software related problems contact the Software Division. For hardware related problems, contact the Digital Group. See current callout lists for individuals responsible for the Spectrometer. The callout person should be contacted unless the observing situation/observing friend dictates otherwise. Refer to the current callout policy as needed.

Changed:
<
<

correct for odd looking lags, steps, or jumps in the ACF.

>
>

correct for odd looking lags, steps, or jumps in the ACF.

Changed:
<
<

the system locks up (hung in a state, not responsive), if procedure 1 fails to help, or as directed by the Digital Group.

>
>

the system locks up (hung in a state, not responsive), if procedure 1 fails to help, or as directed by the callout person.

Changed:
<
<

procedure 2 fails to help, or as directed by the Digital Group.  

>
>

procedure 2 fails to help, or as directed by the callout person.

Changed:
<
<

Recovery Procedure 1

>
>

Recovery Procedure 1

STEP ACTION
1 Use the Reset Parameters option of the Spectrometer menu to reset the hardware.
2 Proceed to recovery procedure 2 and/or call for assistance if the problem persists.

Recovery Procedure 2

STEP ACTION
1 Abort the observation via the Scan Coordinator. If the system goes fatal and remains hung, continue to next step.
2 Check the Spectrometer cabinet System Monitor and System Power lights. If both are ON, continue to next step. If either is OFF, proceed to Recovery Procedure 3.
3 Go to the workstation earth and log in as root.
4 At the root prompt, enter: sync; ^ init ^ 6 (reboots earth).
5 Once earth has rebooted, go to victor (or whatever workstation you're operating from) and ensure correct sourcing. Enter: source ^ /home/gbt/gbt.bash
6 On this same workstation, log in as monctrl and run the TaskMaster command to restart earth's processes: TaskMaster ^ earth ^ systemstart ^ /home/gbt/etc/config/earthProc.conf
7 Monitor the message window and the Spectrometer application to ensure complete recovery.
8 Proceed to recovery procedure 3 or call for additional assistance if problems continue.
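
Steps 5 and 6 of Recovery Procedure 2 can be sketched as a single function to run on the operating workstation as monctrl, after earth has finished rebooting. The commands and paths are those given in the table; the run wrapper, DRY_RUN, and the function name restart_earth_processes are hypothetical. The reboot itself (sync; init 6) is done as root on earth and is deliberately left out.

```shell
#!/bin/bash
# Hypothetical wrapper: with DRY_RUN=1 (the default here) commands are only
# printed, not executed.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Step 5: source the GBT environment. Step 6: restart earth's processes.
# TaskMaster is started only if the sourcing step succeeded.
restart_earth_processes() {
    run source /home/gbt/gbt.bash &&
    run TaskMaster earth systemstart /home/gbt/etc/config/earthProc.conf
}

# Usage: review with `restart_earth_processes`, then execute with
#   DRY_RUN=0; restart_earth_processes
```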

Recovery Procedure 3

STEP ACTION
1 Go to the workstation earth and turn the workstation power OFF (small black switch on front panel inside door)
2 Go to the Spectrometer cabinet, VME rack, and press the RESET button on MVME167.
3 In the Spectrometer cabinet: toggle OFF System Power, toggle OFF System Monitor, wait 5 seconds, toggle ON System Monitor, toggle ON System Power.
4 Go to the workstation earth and turn the workstation power ON. The system will now reboot.
5 Once earth has rebooted, go to victor (or whatever workstation you're operating from) and ensure correct sourcing. Enter: source ^ /home/gbt/gbt.bash
6 On this same workstation, log in as monctrl and run the TaskMaster command to restart processes: TaskMaster ^ earth ^ systemstart ^ /home/gbt/etc/config/earthProc.conf
7 Monitor the message window and the Spectrometer application to ensure complete recovery.
8 Call for assistance if problems continue.
Deleted:
<
<

STEP ACTION
1 Use the Reset Parameters option of the Spectrometer menu to reset the hardware.
2 Proceed to recovery procedure 2 or call for assistance if the problem persists.

Recovery Procedure 2

STEP ACTION
1 Inspect the System Monitor and System Power lights on the Spectrometer cabinet. If both are ON, continue to next step. If either is OFF, proceed to Recovery Procedure 3.
2 Go to the workstation earth and:
  • Turn workstation power OFF (small black switch on front panel inside door)
  • Wait 5 seconds
  • Turn workstation power ON

The system will now reboot.

3 Quit and re-launch Cleo Spectrometer.
4 Monitor the message window and Spectrometer to verify faults clear.

Recovery Procedure 3

STEP ACTION
1 Go to the workstation earth and turn the workstation power OFF (small black switch on front panel inside door)
2 Go to the Spectrometer cabinet, VME rack, and press the RESET button on MVME167.
3 In the Spectrometer cabinet:
  • Toggle OFF System Power
  • Toggle OFF System Monitor
  • Wait 5 seconds
  • Toggle ON System Monitor
  • Toggle ON System power
4 Go to the workstation earth and turn the workstation power ON

The system will now reboot.

5 Quit and re-launch Cleo Spectrometer.
6 Monitor the message window and Spectrometer to verify faults clear.
Changed:
<
<

-- DavidRose - 03 Oct 2003

>
>

-- DavidRose - 21 Jan 2004


 <<O>>  Difference Topic SpectrometerRecoveryProcedures (r1.2 - 23 Oct 2003 - DavidRose)
Changed:
<
<

%META:TOPICPARENT{name="TemporaryGBTOpsArea"}%

>
>

%META:TOPICPARENT{name="TelescopeOperations"}%


 <<O>>  Difference Topic SpectrometerRecoveryProcedures (r1.1 - 03 Oct 2003 - DavidRose)
Added:
>
>

%META:TOPICINFO{author="DavidRose" date="1065191888" format="1.0" version="1.1"}% %META:TOPICPARENT{name="TemporaryGBTOpsArea"}%

TELESCOPE OPERATIONS


SPECTROMETER RECOVERY PROCEDURES

General

The following procedures are used to resurrect the Spectrometer should it fail to function correctly, become hung in a particular state, or report errors (for example, serial line, DMA, or load xilinx files errors). Use the procedures in the order prescribed unless directed otherwise by the Digital Group.

Info 

For Software related problems contact the Software Division (Mark Clark). For hardware related problems, contact the Digital Group (Holly Chen, Rich Lacasse, or John Ford). 

Aborting via the Scan Coordinator should be attempted first as this will occasionally correct a hung state problem.

Recovery Procedure 1: Use to correct for odd looking lags, steps, or jumps in the ACF.

Recovery Procedure 2: Use when the system locks up (hung in a state, not responsive), if procedure 1 fails to help, or as directed by the Digital Group.

Recovery Procedure 3: Use if the Spectrometer's System Power and/or System Monitor lights are found off, if procedure 2 fails to help, or as directed by the Digital Group.  
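
The selection rules above can be summarized as a small decision function. This is a sketch only: the function name choose_procedure and the symptom keywords acf and hung are hypothetical labels, and the fallback message for an unrecognized symptom is an assumption rather than part of the procedure.

```shell
#!/bin/sh
# Hypothetical helper: picks a starting recovery procedure.
#   $1  symptom: "acf" (odd lags, steps, or jumps in the ACF) or
#                "hung" (system locked up, not responsive)
#   $2  System Power light:   "on" or "off"
#   $3  System Monitor light: "on" or "off"
choose_procedure() {
    symptom=$1
    power_light=$2
    monitor_light=$3

    # Either light being off sends the operator straight to Procedure 3.
    if [ "$power_light" = "off" ] || [ "$monitor_light" = "off" ]; then
        echo 3
        return
    fi

    case "$symptom" in
        acf)  echo 1 ;;
        hung) echo 2 ;;
        *)    echo "call for assistance" ;;   # assumption, not in the text
    esac
}

# Usage: choose_procedure hung on on
```

If the chosen procedure fails to help, the next higher-numbered procedure applies, as stated above.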

The character " ^ " is used below to emphasize spaces between dialog elements.

Recovery Procedure 1

STEP ACTION
1 Use the Reset Parameters option of the Spectrometer menu to reset the hardware.
2 Proceed to recovery procedure 2 or call for assistance if the problem persists.

Recovery Procedure 2

STEP ACTION
1 Inspect the System Monitor and System Power lights on the Spectrometer cabinet. If both are ON, continue to next step. If either is OFF, proceed to Recovery Procedure 3.
2 Go to the workstation earth and:
  • Turn workstation power OFF (small black switch on front panel inside door)
  • Wait 5 seconds
  • Turn workstation power ON

The system will now reboot.

3 Quit and re-launch Cleo Spectrometer.
4 Monitor the message window and Spectrometer to verify faults clear.

Recovery Procedure 3

STEP ACTION
1 Go to the workstation earth and turn the workstation power OFF (small black switch on front panel inside door)
2 Go to the Spectrometer cabinet, VME rack, and press the RESET button on MVME167.
3 In the Spectrometer cabinet:
  • Toggle OFF System Power
  • Toggle OFF System Monitor
  • Wait 5 seconds
  • Toggle ON System Monitor
  • Toggle ON System power
4 Go to the workstation earth and turn the workstation power ON

The system will now reboot.

5 Quit and re-launch Cleo Spectrometer.
6 Monitor the message window and Spectrometer to verify faults clear.

-- DavidRose - 03 Oct 2003


Revision r1.1 - 03 Oct 2003 - 14:38 GMT - DavidRose
Revision r1.20 - 06 Feb 2008 - 17:19 GMT - DavidRose
Content copyright © 1999-2007 by the contributing authors.
All material on this collaboration platform is the property of the contributing authors.