newfile
data file management
SYNOPSIS
newfile [-c] [[-u] | file_name [current_scan_number]]
DESCRIPTION
The newfile macro is used to initialize a new data file or reopen an existing data file. Optional arguments are the data file name or template and the current scan number. If no file name argument is given, the macro prompts for one along with the current scan number.
A -c argument (for "continue") means if the file already exists, the scan number will be set from the last scan in that file, even if a different scan number is included on the command line.
A -u argument means update the file name using the current template. For example, if the template includes the date and the day has changed, a new file will be opened. There will be no prompt for a file name.
If the standard global variable DATA_DIR contains a valid path name and if file_name does not begin with a / or a ./, the data file will be opened in the path specified by DATA_DIR. Otherwise the data file will be opened using the path specified in file_name relative to the current directory. The default value for DATA_DIR is ./data. Directories will be created, if possible, as needed.
If the file name is null, /dev/null is used. Any data written to /dev/null will disappear.
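The path rules above can be sketched in Python. This only illustrates the documented behavior and is not the newfile implementation: the function name resolve_data_path is hypothetical, and "a valid path name" is approximated here as any non-empty string.

```python
import posixpath


def resolve_data_path(file_name, data_dir="./data"):
    """Illustrate the documented DATA_DIR rules (hypothetical helper)."""
    if not file_name:
        # A null file name means the data is discarded.
        return "/dev/null"
    if data_dir and not file_name.startswith(("/", "./")):
        # The file is opened in the directory named by DATA_DIR.
        return posixpath.join(data_dir, file_name)
    # Otherwise the name is used as given, relative to the current directory.
    return file_name


print(resolve_data_path("scan.dat"))       # placed under ./data
print(resolve_data_path("/tmp/scan.dat"))  # absolute path, DATA_DIR ignored
```

The sketch does not create missing directories, which newfile does when possible.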
As of spec release 6.12.01, the file_name argument is a template that can contain special conversion sequences. The newfile macro will expand such sequences to create the actual data file name using the file_from_template() macro, described below in the Templates section. The global variable DATAFILE_TEMPLATE contains the file name template. The global variable DATAFILE contains the expanded name.
Note, the standard macros will not update DATAFILE from the template if, for example, the date changes during a session. A call to newfile is necessary to generate an updated file name, although no new parameters need to be entered if using a template. It is sufficient to use newfile -u.
The next scan written to the data file will have scan number current_scan_number + 1. newfile assigns the global variable SCAN_N the value of current_scan_number.
When opening or re-opening an existing data file, the newfile macro does a few checks for consistency. If the current scan number (SCAN_N) doesn't match the last scan number of the file, the newfile macro will prompt whether to assign SCAN_N to continue the existing file scan numbering. If invoked with the -c flag, SCAN_N will be assigned the last scan number from the file without additional prompting.
If an existing file doesn't start with the expected #F header and if no #F header appears later in the file, the newfile macro will print a warning and append the standard header that includes the #F.
If duplicate scan numbers are found in an existing file, the newfile macro will print a warning but will not do anything about it.
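The same checks can be run on a data file from outside spec. The following Python sketch is an illustration only: the function name check_data_file is hypothetical, and it assumes the "last scan number" is the number on the last #S line in the file.

```python
from collections import Counter


def check_data_file(lines):
    """Return (has_F_header, last_scan_number, duplicate_scan_numbers)."""
    # A #F header may appear anywhere in the file, not only on the first line.
    has_f = any(line.startswith("#F ") for line in lines)
    scans = []
    for line in lines:
        if line.startswith("#S "):
            fields = line.split()
            if len(fields) > 1 and fields[1].isdigit():
                scans.append(int(fields[1]))
    # Assumption: the last #S line carries the last scan number.
    last = scans[-1] if scans else None
    duplicates = sorted(n for n, count in Counter(scans).items() if count > 1)
    return has_f, last, duplicates


example = [
    "#F /tmp/data_2023-08-14.dat",
    "#E 1692046973",
    "",
    "#S 1 ascan tth 119 121 3 0.1",
    "",
    "#S 2 ascan tth 119 121 3 0.1",
    "",
    "#S 2 ascan tth 119 121 3 0.1",
]
print(check_data_file(example))
```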
STANDARD spec DATA FILE CONVENTIONS
The standard spec data file is ASCII and contains three types of headers and scan data. Header lines have a # in the first column. The first type of header is the file header, which contains four or more lines as shown:
#F /tmp/data_2023-08-14.dat
#E 1692046973
#D Mon Aug 14 17:02:53 2023
#C fourc User = gerry
The #F line contains the pathname of the data file as it was created. The #E line contains the epoch value in seconds, that is, the system time when the file was created, which is also assigned to the EPOCH global variable. Subsequent data lines contain a column labeled EPOCH that is the difference between the epoch at the time of the data point and the value associated with the #E line. Saving the difference makes for smaller numbers. The #D line is the date and time when the header was written to the file. The date format is determined by the "date_format" option to spec_par(). The #C line is a comment line that contains the value of the TITLE and USER global variables.
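As an illustration, the file header can be read and the relative EPOCH column converted back to absolute time with a few lines of Python. The function names are hypothetical, and the sketch assumes it is given only the file header lines, since #D also appears in scan headers.

```python
def parse_file_header(lines):
    """Collect the #F, #E, #D and #C values from file header lines."""
    header = {}
    for line in lines:
        if line.startswith("#F "):
            header["file"] = line[3:].strip()
        elif line.startswith("#E "):
            header["epoch"] = int(line[3:])
        elif line.startswith("#D "):
            header["date"] = line[3:].strip()
        elif line.startswith("#C "):
            header["comment"] = line[3:].strip()
    return header


def absolute_time(header, epoch_column_value):
    """Add the #E value back to a value from the Epoch data column."""
    return header["epoch"] + epoch_column_value


header = parse_file_header([
    "#F /tmp/data_2023-08-14.dat",
    "#E 1692046973",
    "#D Mon Aug 14 17:02:53 2023",
    "#C fourc User = gerry",
])
# 12.5 is a made-up Epoch column value used only for illustration.
print(absolute_time(header, 12.5))
```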
The second header type contains names and mnemonics for the configured motors and counters, with eight of each per line, as in:
#O0 Two Theta  Theta  Chi  Phi  Slit Gap  Slit Off  Slit Up  Slit Down
#O1 SS Hor Gap  SS Hor Off  Ktheta  Kappa  Kphi  Horizontal
#o0 tth th chi phi sl2g sl2o sl2t sl2b
#o1 shg sho kth kap kphi hor
#J0 Seconds  CHAN 1  CHAN 2  CHAN 3
#j0 sec mon det c3
The #O lines contain the motor names, each separated by two spaces (to accommodate motor names that contain space characters). The #o lines are the corresponding motor mnemonics. The #J and #j contain names and mnemonics for counters. All motors and counters are shown, except those named "unused".
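Because names may contain single spaces, a reader has to split #O and #J lines on runs of two or more spaces, while #o and #j lines can be split on any whitespace. A minimal Python sketch, with hypothetical function names and a sample line written with the two-space separation described above:

```python
import re


def split_names(line):
    """Split a #O or #J line into names (names may hold single spaces)."""
    _tag, _, rest = line.partition(" ")
    return [name for name in re.split(r" {2,}", rest.strip()) if name]


def split_mnemonics(line):
    """Split a #o or #j line into mnemonics (no embedded spaces)."""
    return line.split()[1:]


names = split_names(
    "#O0 Two Theta  Theta  Chi  Phi  Slit Gap  Slit Off  Slit Up  Slit Down"
)
mnemonics = split_mnemonics("#o0 tth th chi phi sl2g sl2o sl2t sl2b")
# Pair each mnemonic with its motor name by position.
motor_names = dict(zip(mnemonics, names))
print(motor_names["sl2g"])
```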
The third header type is associated with each scan. It looks like the following:
#S 16 ascan tth 119 121 3 0.1
#D Sun 27 Aug 16:21:18 2023
#T 0.1 (Seconds)
#G0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
#G1 1.54 1.54 1.54 90 90 90 4.079990459 4.079990459 [ ... ]
#G3 4.079990459 2.498273628e-16 2.498273628e-16 0 [ ... ]
#G4 0.285690376 0.285690376 0.2856904852 5.390450407 [ ... ]
#Q 0.28569 0.28569 0.28569
#P0 120 60 -35.2644 -45 0 -4.35 -4.35 4.35
#P1 1 0.0002 -73.097504 -2.1919999 8.033 79.3
#@MCA 0 MCA_DATA_12 1 4096 ulong SIM_MCA
#N 9
#L Two Theta  H  K  L  Epoch  Seconds  CHAN 3  CHAN 1  CHAN 2
(Note, the "#G" arrays above have been truncated for display purposes.)
The #S line starts the scan header and is always preceded by a blank line. The #D line contains the date and time at the start of the scan. The third line has either a #T or #M tag, depending on whether the scan counts to time or to monitor counts. If counting to time, the tag is followed by the count time in seconds and the name of the sec counter. If counting to monitor, the monitor preset and counter name are shown.
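A reader can recover the scan number, the scan command and the counting mode from the #S line and the #T or #M line. The Python sketch below is illustrative only. The function name is hypothetical, and it assumes the counter name is enclosed in parentheses, as in the example header.

```python
def parse_scan_start(s_line, count_line):
    """Extract scan number, command and counting mode from a scan header."""
    parts = s_line.split(None, 2)
    command = parts[2].strip() if len(parts) > 2 else ""
    tag, preset, counter = count_line.split(None, 2)
    # #T means counting to time, #M means counting to monitor counts.
    mode = {"#T": "time", "#M": "monitor"}[tag]
    return {
        "scan": int(parts[1]),
        "command": command,
        "mode": mode,
        "preset": float(preset),
        # Assumption: the counter name is wrapped in parentheses.
        "counter": counter.strip().strip("()"),
    }


info = parse_scan_start("#S 16 ascan tth 119 121 3 0.1", "#T 0.1 (Seconds)")
print(info["scan"], info["mode"], info["counter"])
```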
Next, if the global variables G, U, UB and/or Q exist, as they do for geometries that employ reciprocal space calculations, the contents of the associated arrays are saved with tags #G0, #G1, #G3 and #G4 respectively (#G2 is not used). Those tags are followed by a #Q tag which contains just the current reciprocal space position (which duplicates the first three values of the #G4 tag).
The meanings of the elements of the geometry parameter arrays G (#G0) and Q (#G4) are specific to the diffractometer type and can be found in the associated C code and/or macro file, for example, geo_fourc.c and macros/fourc.src for the four-circle geometry. The U (#G1) array contains these values:
U[0]   a lattice constant (real space)
U[1]   b lattice constant (real space)
U[2]   c lattice constant (real space)
U[3]   Alpha lattice angle (real space)
U[4]   Beta lattice angle (real space)
U[5]   Gamma lattice angle (real space)
U[6]   a lattice constant (reciprocal space)
U[7]   b lattice constant (reciprocal space)
U[8]   c lattice constant (reciprocal space)
U[9]   Alpha lattice angle (reciprocal space)
U[10]  Beta lattice angle (reciprocal space)
U[11]  Gamma lattice angle (reciprocal space)
U[12]  H of primary reflection
U[13]  K of primary reflection
U[14]  L of primary reflection
U[15]  H of secondary reflection
U[16]  K of secondary reflection
U[17]  L of secondary reflection
U[18]  Angle of primary reflection
U[19]  "
U[20]  "
U[21]  "
U[22]  "
U[23]  "
U[24]  Angle of secondary reflection
U[25]  "
U[26]  "
U[27]  "
U[28]  "
U[29]  "
U[30]  Lambda when or0 was set
U[31]  Lambda when or1 was set
U[32]  Additional angle for primary reflection
U[33]  Additional angle for secondary reflection
The UB (#G3) values are the consecutive rows of the 3x3 orientation matrix.
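As a small illustration, the real-space lattice constants can be picked out of a #G1 line and a #G3 line can be regrouped into matrix rows. The function names are hypothetical. The #G1 values below are the first six from the example header, and the #G3 values are made up (an identity matrix) because the example above is truncated.

```python
def parse_floats(line):
    """Return the numeric fields that follow the tag on a header line."""
    return [float(field) for field in line.split()[1:]]


def lattice_from_u(u):
    """Name the first six U[] elements (real-space lattice parameters)."""
    keys = ("a", "b", "c", "alpha", "beta", "gamma")
    return dict(zip(keys, u[:6]))


def ub_rows(ub):
    """Group nine #G3 values into the consecutive rows of a 3x3 matrix."""
    if len(ub) != 9:
        raise ValueError("expected 9 values for a 3x3 matrix")
    return [ub[0:3], ub[3:6], ub[6:9]]


lattice = lattice_from_u(parse_floats("#G1 1.54 1.54 1.54 90 90 90"))
# Made-up orientation matrix, for illustration only.
matrix = ub_rows(parse_floats("#G3 1 0 0 0 1 0 0 0 1"))
print(lattice["a"], matrix[1])
```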
The #P headers that follow contain the current positions of all the configured motors, corresponding to the #O/#o headers.
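The correspondence between the #o and #P lines can be sketched in Python as follows. The function name is hypothetical, and the sketch assumes the lines are passed in index order (#o0, #o1, ... and #P0, #P1, ...).

```python
def motor_positions(o_lines, p_lines):
    """Map motor mnemonics from #o lines to positions from #P lines."""
    mnemonics = [m for line in o_lines for m in line.split()[1:]]
    positions = [float(v) for line in p_lines for v in line.split()[1:]]
    if len(mnemonics) != len(positions):
        raise ValueError("motor and position counts differ")
    return dict(zip(mnemonics, positions))


positions = motor_positions(
    ["#o0 tth th chi phi sl2g sl2o sl2t sl2b", "#o1 shg sho kth kap kphi hor"],
    ["#P0 120 60 -35.2644 -45 0 -4.35 -4.35 4.35",
     "#P1 1 0.0002 -73.097504 -2.1919999 8.033 79.3"],
)
print(positions["tth"], positions["hor"])
```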
The #@MCA and #@IMG headers assign an ID number for MCA or image data that follow within the scan and also describe the size and parameters of the associated data arrays, if using such detectors. The spectra will be saved to the main data file or to different files, depending on configuration options.
The above standard elements may be followed by locally defined tags generated by the Fheader and/or user_Fheader macros.
The last two items of the scan header are the #N number of columns and the #L column labels. For the latter, each label is separated by two spaces to accommodate labels with an embedded space. The first column is always the primary scanned item. The last column is always the counter assigned to DET. The second to last column is the counter assigned to MON when counting to time. When counting to monitor the second to last column is the counter sec. Default scan plots use the first column as the independent variable and the last column as the dependent variable.
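The column conventions make it simple to extract the default plot data. In the Python sketch below, the function names are hypothetical, the #L line is written with the two-space separation described above, and the two data rows are invented for illustration.

```python
import re


def parse_labels(l_line):
    """Split a #L line into column labels (labels may hold single spaces)."""
    _tag, _, rest = l_line.partition(" ")
    return re.split(r" {2,}", rest.strip())


def default_plot_columns(data_lines):
    """Return (first column, last column) pairs, as used by default plots."""
    rows = [[float(v) for v in line.split()] for line in data_lines]
    return [(row[0], row[-1]) for row in rows]


labels = parse_labels(
    "#L Two Theta  H  K  L  Epoch  Seconds  CHAN 3  CHAN 1  CHAN 2"
)
# Invented data rows with nine columns, matching "#N 9".
points = default_plot_columns([
    "119 0.28 0.28 0.28 10 0.1 0 100 25",
    "120 0.29 0.29 0.29 11 0.1 0 100 40",
])
print(labels[0], labels[-1], points)
```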
Data follows the scan header as space-separated numeric values. The first row of a data array has an @ in the first column, with the number of rows and columns of such arrays determined by the prior header lines #MCA or #IMG.
spec DATA FILE TAGS
The following tags are defined for the standard spec data file for header information and additional notes or comments. Not all are used by the standard macros, but are defined for possible local use.
- #C text
- Comment line with arbitrary text.
- #D date
- Current date and time using the current date format. The default format is "%a %b %d %T %Y", which yields output such as "Sat Feb 14 22:03:21 2026". Use the "date_format" option of spec_par() to change. See the spec_par help file.
- #E num
- The UNIX epoch (number of seconds since 00:00 GMT 1/1/1970).
- #F name
- The name by which the file was created.
- #G0 ...
- Geometry parameters from G[] array (mode, sector, etc.).
- #G1 ...
- Geometry parameters from U[] array (lattice constants, orientation reflections).
- #G3 ...
- Geometry parameters from UB[] array (orientation matrix).
- #G4 ...
- Geometry parameters from Q[] array (lambda, frozen angles, cut points, etc.).
- #I num
- *Normalizing factor to apply to the data.
- #IMG id arr scan:point rows cols type
- Array data associated with image index id follows. See #@IMG below.
- #IMG id arr scan:point filename
- Array data associated with image index id is in the separate file named filename. See #@IMG below.
- #j% ...
- Mnemonics of counter (% = 0, 1, 2, ... with eight counters per row).
- #J% ...
- Names of counters (each separated by two spaces).
- #L s1 ...
- Labels for the data columns (each separated by two spaces).
- #M num
- Data was counted to this many monitor counts.
- #MCA id arr scan:point rows cols type
- Array data associated with MCA index id follows. See #@MCA below.
- #MCA id arr scan:point filename
- Array data associated with MCA index id is in the separate file named filename. See #@MCA below.
- #N num [num2]
- Number of columns of data [ num2 sets per row - not used by the standard macros].
- #o% ...
- Mnemonics of motors (% = 0, 1, 2, ... with eight motors per row).
- #O% ...
- Names of motors (each separated by two spaces).
- #P% ...
- Positions of motors corresponding to above #O/#o.
- #Q H K L
- A reciprocal space position (H K L).
- #R
- *User-defined results from a scan.
- #S num HEADING
- Scan number followed by the string value of the HEADING global variable, which usually contains the scan command.
- #T num
- Data was counted for this many seconds.
- #U
- *User defined.
- #X
- *A temperature.
- #@MCA id arr cols type dev
- Parameters for the MCA array data tagged with id that follow in the scan data. The parameters are array name, number of columns, data type, and device type from the config file. See #MCA above.
- #@IMG id arr rows cols fmt dev
- Parameters for the image array data tagged with id that follow in the scan data. The parameters are array name, number of rows, number of columns, data type, and device type from the config file. See #IMG above.
- #@CALIB a b c
- *Coefficients for x[i] = a + b * i + c * i * i for MCA data
- #@CHANN n f l r
- *MCA channel information (number, first, last, reduction coef)
- #@CTIME p l r
- *MCA count times (preset, elapsed live, elapsed real)
- #@ROI n f l
- *MCA ROI channel information (ROI_name, first, last)
* Not written by the standard spec macros
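A reader for this format usually starts by separating each header line into its tag, an optional numeric index (as in #G0 or #o1) and the remaining text. The following Python sketch shows one way to do that. The regular expression and the function name are assumptions made for illustration and are not part of spec.

```python
import re

# Tag letters, with an optional leading @, then an optional index, then text.
TAG_RE = re.compile(r"^#(@?[A-Za-z]+)(\d*)\s*(.*)$")


def split_tag(line):
    """Return (tag, index or None, payload), or None for non-header lines."""
    match = TAG_RE.match(line)
    if match is None:
        return None
    name, index, payload = match.groups()
    return name, (int(index) if index else None), payload


for sample in ("#S 16 ascan tth 119 121 3 0.1", "#o1 shg sho kth kap kphi hor",
               "#@MCA 0 MCA_DATA_12 1 4096 ulong SIM_MCA", "120 0.28569 25"):
    print(split_tag(sample))
```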
TEMPLATES
The file_from_template() macro allows dynamic file name creation based on several spec variables and the system time. The first argument to the macro is the template. The second argument is the time used for the date/time based conversions.
In addition to the conversion sequences associated with the C library strftime() function, these additional conversions are supported:
%D* - The current data file path name minus any extension
%d* - The current data file name minus path and extension
%P* - The current data file directory
%S* - The current scan number padded with zeros to 3 digits
%#S* - As %S*, but padded to # digits
%N* - The current point number
%#N* - The current point number padded with zeros to # digits
%U* - The value of the USER global variable
%s* - The value of the SPEC global variable
%H* - The value of the HOSTNAME global (if dots, the bit prior to the first dot)
%#A* - The value of arg[#] in the call of file_from_template()
Note, the acq (acquisition) macros for MCA and image data call file_from_template() with the data array name as the third argument, so %3A* will expand to that array name. See the acq help file.
Type "man strftime" to see the many possible conversions for date and time. Examples include:
%H - The hour (24-hour clock) as a decimal number (00-23)
%M - The minute as a decimal number (00-59)
%S - The second as a decimal number (00-60)
%R - Equivalent to %H:%M
%d - The day of the month as a decimal number (01-31)
%m - The month as a decimal number (01-12)
%Y - The year with century as a decimal number
%j - The day of the year as a decimal number (001-366)
%F - Equivalent to %Y-%m-%d
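To show how the starred conversions combine with the strftime() conversions, here is a rough Python approximation. It is not the file_from_template() macro: the function name and its parameters are hypothetical, only %S*, %#S* and %U* are handled, and UTC is used so the result does not depend on the local time zone.

```python
import re
import time


def expand_template(template, when, scan_n, user):
    """Expand %S*, %#S* and %U*, then apply strftime() conversions."""
    def starred(match):
        width, code = match.group(1), match.group(2)
        if code == "S":
            # Scan number, zero padded to 3 digits unless a width is given.
            return "%0*d" % (int(width) if width else 3, scan_n)
        return user  # code == "U"

    # Starred sequences are replaced first so %S* is not read as seconds.
    partial = re.sub(r"%(\d*)([SU])\*", starred, template)
    # Assumption: UTC is used here for a reproducible result.
    return time.strftime(partial, time.gmtime(when))


name = expand_template("data_%Y-%m-%d_%S*_%U*.dat", 0, 16, "gerry")
print(name)
```

The sketch assumes the substituted values contain no % characters, since they are passed through strftime() afterwards.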
ALTERNATIVE DATA FILE FORMATS
The standard macros filter all output to DATAFILE through three data file macro functions: df_print(), df_array() and df_flush() (since spec release 6.10.06). Custom output can be implemented by assigning a string to the global variable DFILE_HOW. If DFILE_HOW is set, the above macros call custom macros whose names end with that string. For example, if DFILE_HOW is set to the string "custom", the functions called will be df_print_custom(), df_array_custom() and df_flush_custom(). The behavior can be seen by looking at the standard macro definitions:
def df_print() '{
    local s
    s = sprintf(arg1, arg2, arg3, arg4, arg5, arg6, arg7, arg8, arg9, arg10)
    if (!DFILE_HOW) {
        fprintf(DATAFILE, s)
        return
    }
    eval(sprintf("df_print_%s(s)", DFILE_HOW), "local")
}'
def df_array(arr, fmt, name) '{
    if (!DFILE_HOW) {
        array_dump(DATAFILE, arr, fmt)
        return
    }
    eval(sprintf("df_array_%s(arr, fmt, name)", DFILE_HOW), "local")
}'
def df_flush() '{
    if (DFILE_HOW)
        eval(sprintf("df_flush_%s()", DFILE_HOW))
}'
In the standard macros, df_print() is called for each line that is printed to DATAFILE, whether a header line or a data line. The df_array() macro function is called to write out data arrays. The df_flush() function is called within the standard _tail macro at the end of a scan, although there is no action associated with the call when using the standard data file output.
The standard macros will always call df_print() with a complete line of output. The strings for header lines always begin with a #, except for the #S scan header lines which begin with a newline: \n#S. As can be seen above, df_print() will be called with no more than ten arguments.
The acq macros call df_array() from the _save_acqdev() macro if acquisition is configured for MCA or image arrays to be saved in DATAFILE. Currently, if acquisition is configured to use separate files for the array data, the save_spectra() macro is called instead. That macro writes directly to the separate files, bypassing the data file macros above, since the df_* macros have DATAFILE built in and don't accommodate alternate file names.
Alternative output formats can be implemented by defining suitable macros.
GLOBALS
- DATA_DIR
- The name of a directory in which to create data files. The default value is ./data, that is, a directory named data in the current working directory. Note, though, if the name in DATAFILE begins with a "/" or "./", DATA_DIR is not used. Directories are created if they don't exist.
- DATAFILE_TEMPLATE
- A template for the data file name that may contain conversion sequences, as described in TEMPLATES, above. Does not include DATA_DIR.
- DATAFILE
- Global variable holding the name of the current data file, formed from DATA_DIR/DATAFILE_TEMPLATE with the conversion sequences expanded. Its value is written in the data file header preceded by #F.
- DATAFILE_HEAD1
- A copy of the motor and counter header info.
- SCAN_N
- Global variable holding the current scan number. Its value is written in the data file scan headers preceded by #S.
- EPOCH
- Constant variable holding the time at which the data file was opened. Its value is written in the data file header preceded by #E. The UNIX epoch is the number of seconds from 00:00 GMT 1/1/1970.
- TITLE
- Global variable written as a comment in the data file header. Its value is an arbitrary string set by the user.
MACROS
- datafile_head1()
- Creates a long string to contain the motor and counter names and mnemonics. The string is written in the data file header. A copy of the string is saved in the global variable DATAFILE_HEAD1. The standard _head macro, called at the start of all the standard scans, calls chk_datafile_head. The chk_datafile_head macro compares a newly generated version of the string to the current version in DATAFILE_HEAD1 to see if the motor or counter configuration has changed. If the configuration has changed, chk_datafile_head writes the modified header string to the data file. Both motor and counter information is written, even if only one or the other has changed, to keep the macros simple.
- chk_datafile_head
- Generates a string using datafile_head1() containing motor and counter names and mnemonics, and writes a new header to the data file if the information has changed since the last time the header was written, or if the configuration option to write the header for each scan has been selected.
- write_datafile_head()
- Writes the data file header one line at a time using df_print().
- newfile_head()
- Sets the EPOCH variable to the current time if SCAN_N equals zero. Writes the standard data file header.
- newfile
- The main macro for setting the data file name or template. Can be invoked with file name/template and optional scan number arguments. Without arguments, prompts for values. Calls newfile_f(). Optional argument -c to automatically assign SCAN_N using the last scan number in an existing file. Optional argument -u to update DATAFILE from DATAFILE_TEMPLATE using the current time and parameters.
- newfile_f()
- A macro function that does most of the work of setting up the DATAFILE and SCAN_N.
- file_from_template(template, time)
- Returns a new string based on the template in template and the epoch value in time. The template conversions are described in TEMPLATES above.
- user_filehead
- An optional cdef()-defined local macro containing items for the data file header.
- user_filehead_func()
- A wrapper for user_filehead to protect local variables.
- user_newfile
- An optional cdef()-defined local macro called at the end of newfile_f() to do whatever local actions may be needed when a new data file is opened.
- user_newfile_func()
- A wrapper for user_newfile to protect local variables.
- user_filecheck(s)
- A user-definable macro function to massage or test the data file name. The argument s is the data file name created by newfile. The function should return the file name to use, which may be s itself, a modified name, or the null string "" if the name is invalid.
- df_print()
- All standard macro output to DATAFILE is passed through this macro line by line.
- df_array()
- Standard macro output of data arrays to DATAFILE is passed through df_array().
- df_flush()
- Called by the standard _tail macro to finish up output when a scan is finished.
