NRAO Home  >  Green Bank  |  Wiki Topic:    GB > Software > PlanOfRecordC22007 > ModificationRequest4C207
   Changes | Index | Contents | Search | Statistics | Go

Modify Peak Finding Algorithm for Pointing/Focus Code

Modification Request #4 (C02 2007)



1. Introduction

When processing "Peak" or "Focus" scans, the algorithm that GFM uses to find the first guess for the location of the peak seems to be very robust when doing peak scans with the standard scan length and rate. But if one uses very different sample rates or scan lengths, the algorithm sometimes fails to find the correct peak. Hence, we propose some changes to allow GFM to find a good first guess for the peak position under a wide variety of circumstances. Of course, it is important that the performance for standard Peak scans is not degraded.

This modification is not expected to handle all extreme cases: if there are fewer than 5 pixels across the profile, we may not necessarily expect to get a good fit.

2. Background

The present algorithm to find the first guess for the peak is to calculate the expected FWHM (full width at half-maximum) and look within +/- FWHM of the center of the scan. The point with the maximum value is used as the first guess of the peak. This location of this peak is used to partition the scan into the portions to be regarded as baseline and the portion to be used for fitting a Gaussian. The problem with this method is the peak will be missed if it is more than a FWHM from the center. This can happen either if the pointing error is larger than normal, or if the scanning rate is such that the beam profile is just a few pixels wide. Even if the point identified as the peak is not correct, but is such that the Gaussian curve fitting is successful, the range identified for baseline fitting may contain data that is more appropriately within the Gaussian region, and will thus skew the fitted baseline polynomial.

For standard Peak scans, the sample rate is 10 Hz, the scan duration is 30 seconds, and the scan length is chosen so that the expected profile is about 10% of the total scan; hence the profile covers about 30 pixels. But if there are only 10 pixels across the profile, the algorithm often fails, even though it should be possible to get a good fit. Are these defaults true for all receivers? My recollection is that PTCS has a table of these values per receiver and that is what Auto* uses.

3. Requirements

3.1 General Requirements

  1. The algorithm should be insensitive to RFI; hence, narrow spikes should not be mistakenly identified as the signal peak.

3.2 Algorithmic Requirements

The proposed new algorithm is as follows.

  1. Look for the maximum value within two FWHMs of the center.
  2. Check that the peak has approximately the right width; if not, then look for the next highest value and repeat the process.
  3. If no profile of the right width can be found, then look for the maximum value within the central 80% of the scan, check the width, and if necessary iterate the process.
  4. If no peak can be found with approximately the right width, then issue an error message and quit.

The method of checking for the right width is yet to be determined and will be the subject of experimentation during the design phase.

Figure 1 illustrates the difference between the current algorithm, and the algorithm that is proposed, for a case in which the peak would not have been identified with the current algorithm.


Figure_1_AlgorithmDifferences_Diagram.png

4. Design

To affect the required changes, the interface to existing peak finding functionality remains unchanged, but the underlying code will be modified to refactor the strategic component (check one range, then another) and width / profile determination component into new methods.

OffsetScan hosts the FindPeak() method, which finds peaks for all processed scans except those Azimuth scans which are dual-beamed. For those scans, AzimuthScan calls it's FindPeaks() method. AzimuthScan is subclassed from OffsetScan.

Module Summary of Modifications
OffsetScan Modify FindPeak() to reflect successive options for finding signal peak. Add method to test for signal-like width / profile.
AzimuthScan Modify FindPeaks() to make use of FindPeak() in parent OffsetScan

4.1 OffsetScan

Type Element Description
Modify FindPeak() Refactor search through scan data points; add method to verify possible peak width / profile.
NEW FindPeakInRange (idxRange, dir) Searches range of scan's integrations (identified by idxRange) for signal peak, pointing in dir (upward, downward) direction
NEW HasFWHMProfile (candidatePeakIdx, dir) Examines data surrounding candidate signal peak to determine if it has a width / profile indicative of Gaussian-like signal

4.1.1 FindPeak (fwhm, x, y, (dir))

The peak-finding portion of FindPeak() is refactored to two (2) new methods, FindPeakInRange() and HasFWHMProfile(). FindPeak() now contains the strategic level functionality, calling FindPeakInRange() successively with the algorithmic range choices stated in the requirements. A new optional argument is added to this method, dir, an enumerated string (in practice) with values 'Normal', 'Positive', and nominally 'Negative'. dir directs the core peak finding method to search for a single, upward-pointing peak ('Normal'), an upward-pointing peak in a dual-peak scan ('Positive'), or a downward-pointing peak in a dual-peak scan ('Negative'). If dir is neither 'Normal' or 'Positive', it's treated as if (but not explicitly checked) it were 'Negative'.

4.1.2 FindPeakInRange (idxRange, fwhm, x, y, dir)

Searches for the signal peak within the idxRange range of integration data points on either side of the center data point of the scan. Examines data points in that range in decreasing order (increasing order for 'Negative' peaks), successively checking each to see if it has the appropriate width / profile by calling the new method HasFWHMProfile(), until it finds one that does, or exhausts the range.

4.1.3 HasFWHMProfile (candidatePeakIdx, fwhm, x, y, dir)

Examines the data points on either side of candidateIdxPeak, to see if they approximately conform to a Gaussian-like curve of the appropriate width (expected width based on the center frequency). As stated in the Requirements section, the actual criteria used to determine if the data "approximately conform" to a Gaussian-like curve is an area of investigation for this MR. While convolution of the actual data with a Gaussian function characterized by the expected width would likely identify the peak, we are searching for a less resource-intensive solution in terms of development.

An initial algorithm with promise for verifying appropriate width and profile is described by the following set of conditions:

where the segments are defined according to the diagram in Figure 2 below. The relations of values and mean values between the segments and points helps ensure a peak-like profile; the set of points defining the segments helps ensure it is a profile of the appropriate width.

Figure_2_HasFWHMProfile_Diagram.png

Throughout testing on multiple datasets, these conditions tend to select only valid signal peaks, and have yet to falsely identify an RFI spike.

4.2 AzimuthScan

Type Element Description
Modify FindPeaks() Invoke the newly modified OffsetScan.FindPeak() method of the parent module.

4.2.1 FindPeaks (fwhm, x, y)

The interface to this method remains unchanged. The current algorithm makes no use of OffsetScan.FindPeak() method; instead, it simply identifies the index of the largest y-value as the 'positive' peak, and the smallest (may be negative) y-value as the 'negative' peak. Such a choice is highly vulnerable to RFI, more so than the current OffsetScan.FindPeak() method.

As modified in this MR, FindPeaks() now makes two calls to the newly modified OffsetScan.FindPeak() method: one, with a dir parameter of 'Positive'; second, with a dir parameter of 'Negative'.

5. Deployment Checklist

6. Test Plan

Testing should have as its primary goal to confirm that the new algorithm successfully selects signal peaks from scans that are missed with the current algorithm. Two strong secondary goals are that the new algorithm does not select 'false positives' in previous datasets, or incorrectly select RFI spikes as signal peaks.

Most of the testing required by this MR can be successfully achieved in offline modes of the affected software. Specifically, re-analyzing old datasets in GFM (clicking each scan) will allow visual confirmation that the new algorithm correctly identifies the signal peak, as evidenced by the fitted Gaussian curve.

6.1 Internal Testing

Existing unit tests which are directly affected by these changes will be reviewed, and modified if necessary. These include unit tests for OffsetScan and AzimuthScan.

New unit tests will be created and added to the unit test suites for OffsetScan and AzimuthScan, to test OffsetScan.FindPeak() and AzimuthScan.FindPeaks() at the top-level, but also the new methods OffsetScan.FindPeakInRange() and OffsetScan.HasFWHMProfile().

Comparative analyses will be performed, wherein the code will be run with old algorithm, then again with new algorithm on the same dataset. The resulting scan fit characteristics and corresponding identified peaks will be compared to determine if the new algorithm improves without false positives over the old algorithm.

Some datasets identified for testing are:

Additional testing will be carried out in GFM, by re-analyzing old datasets in 'offline' mode, and verifying visually that the algorithm has selected the signal peak, as manifest by the fitted Gaussian curve peak matching the apparent signal peak.

6.2 Sponsor Testing

We will run peak scans using both single and dual beam receivers; sample rates of 5 to 50 Hz; scans of different lengths; scans with as few as 4 points in the peak; variations in ratio of peak width to baseline width.

6.3 Integration/Regression Tests

Integration and regressions tests should involve collecting a variety of scans, with visual verification in GFM plots that the signal peak is correctly identified, as determined by a fitted Gaussian curve whose peak appears to match the data. Perform scans that return various 'shapes' - single peak, double-peaks. Perform scans that will show noisier data, to verify the new algorithm isn't overly sensitive to mini-'peaks' in the noise to the detriment of the overall signal peak.

PointingRunV4.sb: The observing script which triggered the problem is linked here for integration/regression testing.


Signatures

APPROVED: I acknowledge that my request is fully contained in this MR, and if the SDD delivers exactly what I specified, I will be happy.

ACCEPTED: I acknowledge that I have validated the completed code according to the acceptance tests, and I am happy with the results.

Written DONE - RonGrider - 2007 April 17
Checked DONE - AmyShelton - 18 April 2007
Approved by Sponsor DONE - FrankGhigo - 19 April 2007
Approved by CCC DONE - RonMaddalena - 4 May 2007
Accepted/Delivered by Sponsor symbol - name - date

Symbols:


CCC Discussion Area

Attachment: sort Action: Size: Date: Who: Comment:
PointingRunV4.sb action 4877 18 Apr 2007 - 15:27 RonGrider Observing script for fast scan rate scans

Topic ModificationRequest4C207 . { Edit | Attach | Ref-By | Printable | Diffs | r1.13 | > | r1.12 | > | r1.11 | More }
Revision r1.13 - 05 May 2007 - 00:17 GMT - RonMaddalena
Parents: PlanOfRecordC22007
Content copyright © 1999-2007 by the contributing authors.
All material on this collaboration platform is the property of the contributing authors.