NRAO Home  >  Green Bank  |  Wiki Topic:    GB > Software > LogCollatorUsrManual (r1.1 vs. r1.7)
 <<O>>  Difference Topic LogCollatorUsrManual (r1.7 - 10 Oct 2007 - ToddHunter)
Added:
>
>

%META:FILEATTACHMENT{name="log.c" attr="" comment="instructions on getting ascii output (Todd)" date="1192038890" path="log.c" size="681" user="ToddHunter" version="1.1"}%


 <<O>>  Difference Topic LogCollatorUsrManual (r1.6 - 20 Feb 2007 - ToddHunter)
Added:
>
>

(Until this utility is officially released, you must first source /home/sparrow/integration/sparrow.bash or /home/sparrow/integration/sparrow.tcsh before running it.)


 <<O>>  Difference Topic LogCollatorUsrManual (r1.5 - 20 Feb 2007 - AmyShelton)
Changed:
<
<

Log Collator User Manual

>
>

Log Collator User Manual


TOC: No TOC in "Software.LogCollatorUsrManual"
Added:
>
>

Changed:
<
<

Example:

>
>

User-Defined Function Example


 <<O>>  Difference Topic LogCollatorUsrManual (r1.4 - 18 Feb 2007 - JoeBrandt)
Deleted:
<
<

Command Line Options and Flags

A number of command line options are available. Most options have both a long and short form. The minimal form of running the log_collator is: log_collator -i specificationfile -o outputfile
Added:
>
>

Command Line Options and Flags

The log_collator is a command-line only utility. A number of command line options are available. Options have both a long and short form. The minimal form of running the log_collator is:
    log_collator -i specificationfile -o outputfile [ options ]
Changed:
<
<

Example statements which use the 'as' clause to relabel the output data to unique names

>
>

Example statements which use the 'as' clause to relabel the output data to unique names in the output file

Changed:
<
<

Example statement which uses mean with a window size of 10 on all columns:

>
>

Example statement where all columns 'share' a single using statement. In this case all columns will be processed by the mean function with a window size of 10.

Changed:
<
<

# In the file "myavg.py"

>
>

In the file "myavg.py":
# For NaN value definition
import fpconst

Added:
>
>

If a value cannot be produced for some reason, then a not-a-number value should be returned instead.

Added:
>
>

if some_other_data_problem: return fpconst.NaN
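A minimal sketch of a user-defined function that follows this rule is shown here. It uses Python's built-in float("nan") in place of fpconst.NaN, and the name safe_avg and the choice to skip NaN samples are assumptions made for illustration only.

```python
import math


def safe_avg(t, data):
    """Average the samples in the window, or return NaN if none are usable.

    't' is the grid time (unused here) and 'data' is a list of
    (timestamp, value) tuples, as described for user-defined functions.
    """
    # Assumption: samples that are already NaN are skipped, not propagated.
    values = [value for _, value in data if not math.isnan(value)]
    if not values:
        # No usable value can be produced, so return a not-a-number value.
        return float("nan")
    return sum(values) / len(values)
```

With an empty window this returns NaN instead of raising, which matches the rule stated above.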

Changed:
<
<

Do not use commas to separate multiple add x as y statements. The use of commas in this case seems to cause columns to be excluded from processing.

>
>

1. Do not use commas to separate multiple add x as y statements. The use of commas in this case seems to cause columns to be excluded from processing.

Added:
>
>

2. If all output data columns are NaN values (e.g. because the data was missing), the log_collator will omit the record in the output. This is easily changed, so I'd like input as to what should be done in this case.
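The behavior described in item 2 amounts to a filter like the following sketch. The helper name keep_record and the row layout (a list of floats per output record) are hypothetical; the actual log_collator code is not shown here.

```python
import math


def keep_record(values):
    """Return False when every output column in the record is NaN."""
    # Note: all() of an empty list is True, so an empty record is dropped too.
    return not all(math.isnan(v) for v in values)


nan = float("nan")
rows = [[1.0, nan], [nan, nan], [2.5, 3.5]]
# Keep only the records that have at least one real value.
kept = [row for row in rows if keep_record(row)]
```

Writing the all-NaN records instead would mean removing this filter, which is the choice item 2 asks for input on.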


 <<O>>  Difference Topic LogCollatorUsrManual (r1.3 - 16 Feb 2007 - JoeBrandt)
Changed:
<
<

The log collator is designed to take log data from multiple monitor points, and resample the log streams onto a common reference. The specification of file paths, data fields, and re-gridding method is expressed in a file using an English-like language syntax. User-written interpolation/processing functions may be added.

>
>

The log collator is designed to take log data from multiple monitor points and resample the log streams onto a common reference. The specification of file paths, data fields, and re-gridding methods is expressed in a text file using an English-like syntax. A limited set of built-in processing functions is available. User-written interpolation/processing functions may be added using Python.

Changed:
<
<

Command Line Flags

The general form of running the log_collator is:
>
>

Command Line Options and Flags

A number of command line options are available. Most options have both a long and short form. The minimal form of running the log_collator is:
Changed:
<
<

The default output file format is a binary format file. An additional file with a ".txt" extension is also produced to describe the column order and processing methods used to generate the data. When the --fits or -f flags are present, a FITS format file will be written. No description text file is produced.

>
>

The default output file format is a binary format file. When writing binary files, an additional file with a ".txt" extension is also produced to describe the column order and processing methods used to generate the data. When the --fits or -f flags are present, a FITS format file will be written, but no description text file is produced.


 <<O>>  Difference Topic LogCollatorUsrManual (r1.2 - 16 Feb 2007 - JoeBrandt)
Added:
>
>

The default output file format is a binary format file. An additional file with a ".txt" extension is also produced to describe the column order and processing methods used to generate the data. When the --fits or -f flags are present, a FITS format file will be written. No description text file is produced.

Changed:
<
<

There are two main types of statements. The first type appears as assignments to certain pre-defined values: Path, Starttime, Endtime, and Interval. For Example:

>
>

There are two main types of statements. The first type appears as assignments to certain pre-defined values: Path, Starttime, Endtime, and Interval. For Example:

Changed:
<
<

Normally Starttime, Endtime and Interval are set only once, but the Path keyword can be set multiple times.

>
>

Starttime, Endtime and Interval should be set only once; however, the Path keyword can be reset between with statements. Each assignment of Path may contain multiple directories separated by colons. For example:

# Look in gbtlogs or the archive
Path=/home/gbtlogs:/home/archive
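One way to picture how a multi-directory Path combines with a with clause is sketched below. The helper candidate_dirs is hypothetical, and the manual does not state the search order or whether the first match wins, so this only lists the candidate directories.

```python
import posixpath


def candidate_dirs(path_value, dirname):
    """List Path/dirname for each colon-separated entry of a Path value."""
    # Empty entries (for example from a trailing colon) are ignored.
    bases = [base for base in path_value.split(":") if base]
    return [posixpath.join(base, dirname) for base in bases]


dirs = candidate_dirs("/home/gbtlogs:/home/archive", "Weather-Weather1-weather1")
```

Here dirs holds one entry under /home/gbtlogs and one under /home/archive, in the order they appear in Path.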
Changed:
<
<

with somedirname add fieldname using processing_function window size

>
>

with somedirname add fieldname using processing_function window size

Changed:
<
<

  • with somedirname - means to search the directory Path/somedirname for files within the time range given by Starttime and Endtime.
  • add fieldname - means to include the column in the processing
  • using processing_function - means to apply the processing_function in computing the values for the re-gridded data. See below for the list of built-in functions, and how to write user-defined functions.
  • window size - means that a range of data of length size will be provided to the processing function.
>
>

  • with somedirname - means to search the directory Path/somedirname for files within the time range given by Starttime and Endtime.
  • add fieldname - means to include the column fieldname in the processing
  • using processing_function - means to apply the processing_function in computing the values for the re-gridded data. See below for the list of built-in functions, and how to write user-defined functions. Several columns may share a using clause.
  • window size - means that a range of data of length size will be provided to the processing function for each calculation.
Changed:
<
<

  • mean - computes the average value from the data window
  • median - computes the median value from the data window
  • linear - computes a linearly interpolated value from the data window (window size must be 2)
  • neighbor - finds the nearest data point to the time specified.
>
>

  • mean - computes the average value from the data window
  • median - computes the median value from the data window
  • linear - computes a linearly interpolated value from the data window (window size must be 2)
  • neighbor - finds the nearest data point to the time specified.

The using processing_function window size clause may be omitted, although this is not advised. In this case the log_collator will attempt to use some reasonable defaults:

  • If the time grid step is finer than the sampled data, then linear interpolation is used.
  • If the time grid is coarser than the sampled data, and there are more than 5 data points per time grid step, a median is used; otherwise a mean is used.
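The default rules above can be sketched in Python as follows. This is only an illustration: the function name choose_default and its two arguments are hypothetical, and the manual does not say how the log_collator measures the sampling interval or what it does when the two rates are equal (this sketch falls through to a mean).

```python
def choose_default(grid_step, sample_interval):
    """Pick a default processing function name.

    Both arguments are durations in the same time unit: the spacing of
    the output time grid and the spacing of the logged samples.
    """
    if grid_step < sample_interval:
        # The grid is finer than the data: interpolate between samples.
        return "linear"
    # The grid is coarser than (or, by assumption, equal to) the data.
    points_per_step = grid_step / sample_interval
    if points_per_step > 5:
        return "median"
    return "mean"
```

For example, a 0.1 unit grid over data sampled every 1 unit selects linear, while a 10 unit grid over the same data selects median.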

Example specification statements:

Example statement which defaults the processing functions and window sizes:
   with Weather-Weather1-weather1 add WINDVEL, WINDDIR
Example statements which use the 'as' clause to relabel the output data to unique names
   with Accelerometer-Accelerometer1-AccelerometerData
        add X as X_1
        add Y as Y_1
        add Z as Z_1
   with Accelerometer-Accelerometer2-AccelerometerData
        add X as X_2
        add Y as Y_2
        add Z as Z_2
Example specifying different processing methods for each field:
   with Accelerometer-Accelerometer3-AccelerometerData
        add X as X_3 using mean window 3
        add Y as Y_3 using median window 5
        add Z as Z_3 using neighbor
Example statement which uses mean with a window size of 10 on all columns:
   with Accelerometer-Accelerometer3-AccelerometerData
        add X as X_3
        add Y as Y_3
        add Z as Z_3 using mean window 10
Changed:
<
<

Writing a user-defined routine is relatively easy, but the function must accept two arguments: a time and a list of (timestamp, data) tuples. For example, a simple average function:

>
>

A user-defined processing function must accept two arguments: a scalar time and a list of (timestamp, data) tuples. For example, to write a simple average function:

Changed:
<
<

for i in data: c += i[1]

>
>

for datapt in data: c += datapt[1] # 0==timetag, 1==data sample

Added:
>
>

Current Known Issues:

Do not use commas to separate multiple add x as y statements. The use of commas in this case seems to cause columns to be excluded from processing.
    Don't do this:
    with Weather-Weather1-weather1 add WINDVEL as windspeed , add WINDDIR as winddirection
    Do this instead:
    with Weather-Weather1-weather1 add WINDVEL as windspeed add WINDDIR as winddirection

 <<O>>  Difference Topic LogCollatorUsrManual (r1.1 - 16 Feb 2007 - JoeBrandt)
Added:
>
>

%META:TOPICINFO{author="JoeBrandt" date="1171591020" format="1.0" version="1.1"}% %META:TOPICPARENT{name="ModificationRequest3C107"}%

Log Collator User Manual

Introduction

The log collator is designed to take log data from multiple monitor points, and resample the log streams onto a common reference. The specification of file paths, data fields, and re-gridding method is expressed in a file using an English-like language syntax. User-written interpolation/processing functions may be added.

Command Line Flags

The general form of running the log_collator is: log_collator -i specificationfile -o outputfile

A number of additional options may be specified:

-i, --input          Specifies the spec file to be read
-o, --output         Specifies the name of the output data file
-f, --fits           Specifies that a FITS format file should be written
-q, --quiet          Inhibit processing progress messages
-c, --check          Just parse the file and check directory paths
-t, --interval       Overrides the interval set in the specification file
-s, --start          Overrides the start time set in the specification file
-e, --end, --stop    Overrides the stop time set in the specification file
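For scripted use, the options above can be assembled into an argument list, as in this sketch. The helper build_command is hypothetical, and passing the interval as a separate argument after --interval is an assumption; only flags listed above are used.

```python
def build_command(spec_file, output_file, fits=False, quiet=False, interval=None):
    """Build an argument list for running log_collator."""
    # Minimal form: log_collator -i specificationfile -o outputfile
    cmd = ["log_collator", "-i", spec_file, "-o", output_file]
    if fits:
        cmd.append("--fits")
    if quiet:
        cmd.append("--quiet")
    if interval is not None:
        # Assumption: the override value follows the flag as its own argument.
        cmd += ["--interval", str(interval)]
    return cmd


cmd = build_command("my.spec", "out.dat", fits=True, interval=0.5)
```

The resulting list can be handed to subprocess.run(cmd, check=True) once the environment has been set up.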

Specification File Syntax

There are two main types of statements. The first type appears as assignments to certain pre-defined values: Path, Starttime, Endtime, and Interval. For Example:
Path=/home/gbtlogs
Starttime=2007-02-14 12:01:00
Endtime=2007-02-28 23:59:00
Interval=0.1
Normally Starttime, Endtime and Interval are set only once, but the Path keyword can be set multiple times.
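To make the assignment syntax concrete, the following sketch parses such lines in Python. It is not the log_collator's parser: the function name, the case-insensitive keyword matching, and the timestamp format string are assumptions inferred from the example above.

```python
from datetime import datetime

# Assumed timestamp layout, taken from the example values above.
TIME_FORMAT = "%Y-%m-%d %H:%M:%S"


def parse_assignments(lines):
    """Collect Path, Starttime, Endtime and Interval settings from spec lines."""
    settings = {}
    for line in lines:
        line = line.strip()
        # Skip blanks, comments, and anything that is not an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        key = key.strip().lower()  # assumption: keywords are case-insensitive
        value = value.strip()
        if key in ("starttime", "endtime"):
            settings[key] = datetime.strptime(value, TIME_FORMAT)
        elif key == "interval":
            settings[key] = float(value)
        elif key == "path":
            settings[key] = value.split(":")
    return settings


spec = [
    "Path=/home/gbtlogs",
    "Starttime=2007-02-14 12:01:00",
    "Endtime=2007-02-28 23:59:00",
    "Interval=0.1",
]
settings = parse_assignments(spec)
```

A later Path line simply replaces the earlier value in this sketch, which mirrors the statement that Path can be set multiple times.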

The second type of statement involves a number of keywords:

    with somedirname add fieldname using processing_function window size
Where:
  • with somedirname - means to search the directory Path/somedirname for files within the time range given by Starttime and Endtime.
  • add fieldname - means to include the column in the processing
  • using processing_function - means to apply the processing_function in computing the values for the re-gridded data. See below for the list of built-in functions, and how to write user-defined functions.
  • window size - means that a range of data of length size will be provided to the processing function.

The following processing functions are pre-defined:

  • mean - computes the average value from the data window
  • median - computes the median value from the data window
  • linear - computes a linearly interpolated value from the data window (window size must be 2)
  • neighbor - finds the nearest data point to the time specified.
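As a rough illustration of what each pre-defined function computes, the sketch below re-implements them in Python on a window of (timestamp, value) tuples. These are not the log_collator's own implementations; the two-argument signature mirrors the one described for user-defined functions, and details such as tie-breaking in neighbor are assumptions.

```python
def mean(t, data):
    """Average of the values in the window."""
    values = [value for _, value in data]
    return sum(values) / len(values)


def median(t, data):
    """Middle value of the window (mean of the two middle values if even)."""
    values = sorted(value for _, value in data)
    mid = len(values) // 2
    if len(values) % 2:
        return values[mid]
    return (values[mid - 1] + values[mid]) / 2.0


def linear(t, data):
    """Linear interpolation at time t; the window must hold exactly 2 points."""
    (t0, v0), (t1, v1) = data
    if t1 == t0:
        return v0
    return v0 + (v1 - v0) * (t - t0) / (t1 - t0)


def neighbor(t, data):
    """Value of the sample whose timestamp is closest to t (first one on ties)."""
    return min(data, key=lambda point: abs(point[0] - t))[1]


window = [(0.0, 1.0), (10.0, 3.0)]
```

Calling linear(5.0, window) interpolates halfway between the two samples, while neighbor(7.0, window) picks the sample at timestamp 10.0.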

Writing User Defined Processing Functions

The log_collator allows the user to provide processing functions written in Python. The name of the file should be given in an import statement directly in the specification file. (Note that it may be necessary to provide a full pathname to the user-defined file.)

Example:

Writing a user-defined routine is relatively easy, but the function must accept two arguments: a time and a list of (timestamp, data) tuples. For example, a simple average function:
# In the file "myavg.py"
def myavg(t, data):
    """
    A function to compute a simple average of a few data points.
    The time to be calculated is 't', and the data argument has the form of
    a list of tuples of timestamps and data. The length of data will be what
    was specified in the 'window' clause.
    """
    if len(data) < 1:
        raise ValueError("WindowSizeError: empty data window")

    c = 0.0  # running sum of the data samples
    for i in data:
        c += i[1]  # i[0] is the timestamp, i[1] is the data sample

    return c / len(data)

In the specification file, prior to any 'with' statements, add an import of your .py file:

import myavg.py
# or alternatively:
import /users/joeastro/myfuncs/myavg.py

# To use myavg, specify it in a using clause:
with Weather-Weather1-weather1 add WINDVEL using myavg window 12
# That's it!

-- JoeBrandt - 16 Feb 2007

Revision r1.1 - 16 Feb 2007 - 01:57 GMT - JoeBrandt
Revision r1.7 - 10 Oct 2007 - 17:54 GMT - ToddHunter
Content copyright © 1999-2007 by the contributing authors.
All material on this collaboration platform is the property of the contributing authors.