earthscopestraintools.processing module

earthscopestraintools.processing.apply_Agnew_2007(df: DataFrame)

Filter and decimate 1 Hz data to 5 min (300 s) data using the filters described in:

Agnew, Duncan Carr, and K. Hodgkinson (2007), Designing compact causal digital filters for low-frequency strainmeter data, Bulletin of the Seismological Society of America, 97, No. 1B, 91-99

Parameters: df (pd.DataFrame) – 1 Hz continuous data
Returns: 300s (5 min) data
Return type: pd.DataFrame

earthscopestraintools.processing.apply_calibration_matrix(df: DataFrame, calibration_matrix: array, calibration_matrix_name: str | None = None, use_channels: list = [1, 1, 1, 1])

Applies a calibration matrix to convert 4 gauges into areal, differential, and shear strains

Parameters:

df (pd.DataFrame) – four channels of strain data or correction in microstrain
calibration_matrix (np.array) – calibration matrix
calibration_matrix_name (str, optional) – name of calibration matrix used, defaults to None
use_channels (list, optional) – not yet implemented, set to 0 to ignore a bad channel, defaults to [1, 1, 1, 1]

Returns:

areal, differential, and shear strains based on the given calibration

Return type:

pd.DataFrame
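
As a rough illustration of the arithmetic involved, the sketch below multiplies a 3x4 matrix by four gauge readings to produce areal, differential, and shear strain. The matrix values, the row ordering, and the helper name `apply_matrix` are made up for illustration; this is not the library's implementation, and plain Python lists are used only to keep the arithmetic visible.

```python
# Hypothetical 3x4 calibration matrix (rows assumed to be areal,
# differential, shear). The values are illustrative only and do not
# come from any real station.
CAL = [
    [0.25, 0.25, 0.25, 0.25],
    [0.5, 0.0, -0.5, 0.0],
    [0.0, 0.5, 0.0, -0.5],
]


def apply_matrix(cal, gauges):
    """Multiply a 3x4 matrix by a length-4 vector of gauge strains."""
    return [sum(c * g for c, g in zip(row, gauges)) for row in cal]


gauges = [1.0, 2.0, 3.0, 4.0]  # one sample of CH0..CH3, in microstrain
areal, differential, shear = apply_matrix(CAL, gauges)
print(areal, differential, shear)
```

Applied to a full time series, the same product is evaluated once per row of the input data.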

earthscopestraintools.processing.baytap_analysis(df, atmp_df, quality_df=None, atmp_quality_df=None, latitude=None, longitude=None, elevation=None, dmin=0.001)

Runs BAYTAP08 (Tamura 1991; Tamura and Agnew 2008) for tidal analysis, locally if available and otherwise in a Docker container. Time series (e.g. strain) and additional auxiliary input (e.g. pressure) are analyzed together to determine the amplitudes and phases of a combination of tidal constituents (M2, O1, P1, K1, N2, S2) in the time series, as well as a coefficient for the auxiliary input response. Please refer to the Baytap08 manual for more details, including the suggested length of the time series for full determination of the tidal constituent suite returned here (365+ days).

Parameters:

df (pd.DataFrame) – DataFrame of timeseries with datetime index and one channel per column. Strain should be in microstrain, and pressure in hPa.
atmp_df (pd.DataFrame) – DataFrame with atmospheric pressure data and datetime index
quality_df (pd.DataFrame) – DataFrame with flags designating the quality of the data. Any points that are not good (g) are ignored in the time series analysis.
units (str) – Units of strain, should match microstrain or nanostrain
atmp_quality_df (pd.DataFrame) – DataFrame with flags designating the quality of the pressure data. Any points that are not good (g) are ignored in the time series analysis.
atmp_units (str) – Units of atmospheric pressure data, should be hPa
latitude (float) – latitude of the station
longitude (float) – longitude of the station
elevation (float) – elevation of the station
dmin (float) – Drift parameter for the program. Large drift expects a linear trend. Small drift allows for rapid changes in the residual time series.

Returns:

Dictionary of amplitudes and phases for each tidal constituent per gauge, and atmospheric pressure coefficient.

Return type:

dict

earthscopestraintools.processing.butterworth_filter(df: DataFrame, period: float, filter_type: str, filter_order: int, filter_cutoff_s: float)

Apply a Butterworth filter to a DataFrame using scipy.signal.butter()

Parameters:

df (pd.DataFrame) – data to filter
period (float) – sample period of data
filter_type (str) – {‘lowpass’, ‘highpass’, ‘bandpass’, ‘bandstop’}
filter_order (int) – the order of the filter
filter_cutoff_s (float) – the filter cutoff period in seconds (the cutoff frequency is 1 / filter_cutoff_s Hz)

Returns:

Butterworth-filtered data

Return type:

pandas.DataFrame
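
Because both the sample period and the cutoff are given in seconds, they have to be turned into a frequency before a digital filter can be designed. When scipy.signal.butter() is called without its `fs` argument, it expects the critical frequency as a fraction of the Nyquist frequency. The sketch below shows one way that conversion could be done; the helper name `normalized_cutoff` is hypothetical, and whether the library performs the conversion exactly this way is an assumption.

```python
def normalized_cutoff(period, filter_cutoff_s):
    """Return the cutoff as a fraction of the Nyquist frequency.

    period          -- sample period of the data, in seconds
    filter_cutoff_s -- cutoff period of the filter, in seconds
    """
    sample_rate_hz = 1.0 / period
    nyquist_hz = 0.5 * sample_rate_hz
    cutoff_hz = 1.0 / filter_cutoff_s  # cutoff period -> cutoff frequency
    wn = cutoff_hz / nyquist_hz
    if not 0.0 < wn < 1.0:
        raise ValueError("cutoff frequency must lie below the Nyquist frequency")
    return wn


# 300 s data with a 2 hour (7200 s) cutoff period
wn = normalized_cutoff(period=300, filter_cutoff_s=7200)
print(wn)
```

A cutoff period shorter than twice the sample period falls above Nyquist and is rejected here, since such a filter cannot be realized on the sampled data.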

earthscopestraintools.processing.calculate_double_exponential_trend_correction(df, detrend_params)

Use parameters from station metadata to calculate a double exponential trend correction. Only works on gauge data (there are no detrend parameters in the metadata for regional strains)

Parameters:

df (pd.DataFrame) – uncorrected data, as dataframe with datetime index and CH0-CH3 as columns
detrend_params (dictionary) – detrend_params dictionary loaded by GtsmMetadata module

Returns:

trend correction timeseries for CH0-CH3

Return type:

_type_

earthscopestraintools.processing.calculate_linear_trend_correction(df, method='linear', trend_start=None, trend_end=None)

Generate a linear trend correction via either a linear least squares calculation or a median trend calculation. The median trend calculation (based on MIDAS in Blewitt et al., 2016 for GNSS time series analysis) uses the median slope value from all points separated by roughly one lunar day, calculated after outliers beyond 2 median absolute deviations are removed. It will only work with > 3 days of data.

Parameters:

df (pd.DataFrame) – uncorrected data, as dataframe with datetime index and 1 channel per column
method (str, optional) – “linear” or “median”, defaults to “linear”
trend_start (datetime.datetime, optional) – start of window to calculate trend, defaults to first_valid_index()
trend_end (datetime.datetime, optional) – end of window to calculate trend, defaults to last_valid_index()

Returns:

trend correction timeseries for each column/channel in input data

Return type:

pd.DataFrame
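
The median trend idea described above can be sketched in a few lines. This is a simplified illustration, not the library's code: it assumes regularly sampled data, takes the pair separation as a number of samples (`lag_samples`, a hypothetical parameter), and applies the 2 MAD outlier cut to the slopes themselves.

```python
from statistics import median


def median_slope(times_s, values, lag_samples):
    """Median of slopes between samples separated by `lag_samples`."""
    slopes = [
        (values[i + lag_samples] - values[i])
        / (times_s[i + lag_samples] - times_s[i])
        for i in range(len(values) - lag_samples)
    ]
    med = median(slopes)
    mad = median([abs(s - med) for s in slopes])
    # Drop slopes more than 2 median absolute deviations from the median;
    # fall back to all slopes if the cut would remove everything.
    kept = [s for s in slopes if abs(s - med) <= 2 * mad] or slopes
    return median(kept)


# Toy series: a trend of 0.5 units per second with one spike at index 5.
times = list(range(10))
values = [0.5 * t for t in times]
values[5] += 100.0
slope = median_slope(times, values, lag_samples=2)
print(slope)
```

For 300 s data, a lag of roughly one lunar day (about 24 h 50 min) corresponds to 298 samples. The trend correction itself would then be the line with this slope evaluated at each timestamp.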

earthscopestraintools.processing.calculate_offsets(df, limit_multiplier: int = 10, cutoff_percentile: float = 0.75)

Calculate offsets using first differencing method (add more details).

Parameters:

df (pandas.DataFrame) – uncorrected data, as dataframe with datetime (in seconds) index and 1 channel per column
limit_multiplier (int, optional) – _description_, defaults to 10
cutoff_percentile (float, optional) – _description_, defaults to 0.75

Returns:

_description_

Return type:

_type_

earthscopestraintools.processing.calculate_pressure_correction(df: DataFrame, response_coefficients: dict)

Generate a pressure correction timeseries from pressure data and response coefficients

Parameters:

df (pd.DataFrame) – atmospheric pressure data
response_coefficients (dict) – response coefficients for each channel loaded from metadata

Returns:

pressure corrections for each channel

Return type:

pd.DataFrame
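
A minimal sketch of the idea, assuming a simple proportional response in which each channel's correction is its coefficient times the pressure reading. The channel names, coefficient values, units (microstrain per hPa), and sign convention below are all illustrative assumptions, and `pressure_correction` is a hypothetical helper rather than the library's function.

```python
def pressure_correction(pressure_hpa, response_coefficients):
    """Scale a pressure series by each channel's response coefficient."""
    return {
        channel: [coefficient * p for p in pressure_hpa]
        for channel, coefficient in response_coefficients.items()
    }


# Illustrative values only; real coefficients come from station metadata.
pressure = [1000.0, 1002.0, 998.0]
coefficients = {"CH0": -0.004, "CH1": 0.002}
corrections = pressure_correction(pressure, coefficients)
print(corrections["CH0"])
```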

earthscopestraintools.processing.calculate_tide_correction(df, period, tidal_parameters, longitude)

Generate tidal correction timeseries using SPOTL hartid

Parameters:

df (pd.DataFrame) – uncorrected data, as dataframe with datetime index and 1 channel per column
period (int) – sample period of data, must be >= 1
tidal_parameters (dict) – tidal parameters loaded from station metadata
longitude (float) – station longitude

Returns:

tidal correction timeseries for each column/channel in input data

Return type:

pd.DataFrame

earthscopestraintools.processing.decimate_1s_to_300s(df: DataFrame, method: str = 'linear', limit: int = 3600)

Filter and decimate 1 Hz data to 5 min (300 s) data using the filters described in:

Agnew, Duncan Carr, and K. Hodgkinson (2007), Designing compact causal digital filters for low-frequency strainmeter data, Bulletin of the Seismological Society of America, 97, No. 1B, 91-99

This function will interpolate gaps up to ‘limit’ samples. If gaps remain, it will break the dataframe into continuous chunks, apply the filter to each one, and then recombine into a single dataframe. Remaining gaps are filled with NaNs.

Parameters:

df (pd.DataFrame) – 1 Hz data
method (str, optional) – method to interpolate across gaps, defaults to “linear”
limit (int, optional) – largest gap to interpolate, defaults to 3600 samples

Returns:

300s (5 min) data

Return type:

pd.DataFrame

earthscopestraintools.processing.decimate_to_hourly(df: DataFrame)

Decimates a timeseries to hourly by selecting the sample at the first minute and first second of each hour (minute 0, second 0)

Parameters: df (pd.DataFrame) – time series data to decimate
Returns: decimated data
Return type: pd.DataFrame
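
The selection rule can be illustrated without pandas. The sketch below keeps only samples whose timestamp falls exactly on the hour; the helper name `hourly_samples` and the list-of-tuples layout are assumptions made for the example.

```python
from datetime import datetime, timedelta


def hourly_samples(samples):
    """Keep only (timestamp, value) pairs that fall exactly on the hour."""
    return [(t, v) for t, v in samples if t.minute == 0 and t.second == 0]


# Two hours of 300 s data starting on the hour (25 samples).
start = datetime(2023, 1, 1, 0, 0, 0)
samples = [(start + timedelta(seconds=300 * i), float(i)) for i in range(25)]
hourly = hourly_samples(samples)
print([v for _, v in hourly])
```

With a pandas DatetimeIndex, the same test can be written as a boolean mask on the index's `minute` and `second` attributes.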

earthscopestraintools.processing.double_exponential_trend_model(t, F, A1, T1, M, A2, T2)

The model used to fit a double exponential trend

Parameters:

t (_type_) – _description_
F (_type_) – _description_
A1 (_type_) – _description_
T1 (_type_) – _description_
M (_type_) – _description_
A2 (_type_) – _description_
T2 (_type_) – _description_

Returns:

_description_

Return type:

_type_

earthscopestraintools.processing.get_eig(df: DataFrame)

Tool to extract eigenvalues and azimuths (from north) from a timeseries with areal (Eee+Enn), differential (Eee-Enn), and engineering shear strain (2Een).

Parameters: df (pd.DataFrame) – dataframe with areal (Eee+Enn), differential (Eee-Enn), and engineering shear strain (2Een) columns
Returns: amplitudes and azimuths for the two eigenvectors, in the order amp1, az1, amp2, az2
Return type: np.array
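
For a single sample, the principal strains follow in closed form from the three regional strains. The sketch below is one way to compute them, assuming azimuths are measured clockwise from north and folded into [0, 180), and that the larger eigenvalue is reported first. The function name `principal_strains` is hypothetical, and the library's own ordering and angle conventions may differ.

```python
import math


def principal_strains(areal, differential, shear):
    """Eigenvalues and azimuths of the 2D strain tensor.

    areal = Eee + Enn, differential = Eee - Enn, shear = 2 * Een.
    """
    radius = 0.5 * math.hypot(differential, shear)
    amp1 = 0.5 * areal + radius  # larger eigenvalue
    amp2 = 0.5 * areal - radius  # smaller eigenvalue
    # Angle of the first principal axis, counterclockwise from east.
    theta = 0.5 * math.atan2(shear, differential)
    az1 = (90.0 - math.degrees(theta)) % 180.0  # clockwise from north
    az2 = (az1 + 90.0) % 180.0
    return amp1, az1, amp2, az2


# Pure east-west extension: Eee = 1, Enn = 0, Een = 0.
amp1, az1, amp2, az2 = principal_strains(areal=1.0, differential=1.0, shear=0.0)
print(amp1, az1, amp2, az2)
```

The two eigenvalues always sum to the areal strain, which gives a quick consistency check on any result.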

earthscopestraintools.processing.interpolate(df: DataFrame, replace: int = 999999, method: str = 'linear', limit: int = 3600, limit_direction='forward')

Interpolate across gaps in data using pd.DataFrame.interpolate()

Parameters:

df (pd.DataFrame) – data to be interpolated
replace (int, optional) – gap fill value to interpolate across, defaults to 999999
method (str, optional) – interpolation method, defaults to “linear”
limit (int, optional) – max number of samples to interpolate, defaults to 3600
limit_direction (str, optional) – [‘forward’, ‘backward’, ‘both’], defaults to “forward”

Returns:

interpolated data

Return type:

pd.DataFrame

earthscopestraintools.processing.is_continuous(df)

Method to determine if there are gaps in a dataframe

Parameters: df (pd.DataFrame) – dataframe
Returns: True if no gaps, False if there are gaps
Return type: bool

earthscopestraintools.processing.linearize(df: DataFrame, reference_strains: dict, gap: float)

Linearize raw gauge strain.

Parameters:

df (pd.DataFrame) – Dataframe with four columns corresponding to raw gauge data in units of counts.
reference_strains (dict) – Dictionary with four entries noting the reference count value on each gauge to zero the timeseries against in the conversion to gauge strain (in microstrain).
gap (float) – Instrument measurement gap. For most NOTA strainmeters, the reference gap is 0.0001 m.

Returns:

DataFrame with four columns of microstrain.

Return type:

pd.DataFrame
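
The sketch below follows the linearization commonly documented for GTSM gauges, in which raw counts are scaled by 1e8, converted to a ratio, differenced against the reference value, and multiplied by the gap-to-diameter ratio. The 1e8 scale and the 0.087 m instrument diameter are assumptions not stated in this reference, and `linearize_gauge` is a hypothetical helper, not the library's function.

```python
def linearize_gauge(counts, reference_counts, gap, diameter=0.087, scale=1e8):
    """Convert raw gauge counts to microstrain relative to a reference.

    diameter (m) and scale are assumed values, see the note above.
    """

    def ratio(c):
        x = c / scale
        return x / (1.0 - x)

    reference = ratio(reference_counts)
    return [(ratio(c) - reference) * (gap / diameter) * 1e6 for c in counts]


raw = [50_000_000, 50_001_000, 49_999_000]
strain = linearize_gauge(raw, reference_counts=50_000_000, gap=0.0001)
print(strain)
```

A sample equal to the reference count maps to zero strain, so the output series is zeroed against the reference by construction.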

earthscopestraintools.processing.split_into_continuous_dataframes(df)

Method for breaking gappy data into multiple continuous dataframes. Used for decimation of 1s data to 300s.

Parameters: df (pd.DataFrame) – non-continuous dataframe
Returns: list of continuous dataframes
Return type: list[pd.DataFrame]
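
Gap detection and splitting can be illustrated on a bare list of timestamps. The sketch below groups sorted epoch-second timestamps into runs whose spacing equals the sample period; the helper name `split_on_gaps` is hypothetical, and the library operates on DataFrames rather than lists.

```python
def split_on_gaps(timestamps, period):
    """Group sorted timestamps into runs spaced exactly `period` apart."""
    if not timestamps:
        return []
    chunks = [[timestamps[0]]]
    for previous, current in zip(timestamps, timestamps[1:]):
        if current - previous == period:
            chunks[-1].append(current)  # still inside a continuous run
        else:
            chunks.append([current])    # a gap starts a new run
    return chunks


times = [0, 1, 2, 5, 6, 10]
chunks = split_on_gaps(times, period=1)
print(chunks)
```

In these terms, a series is continuous exactly when the split returns a single chunk.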

earthscopestraintools.processing.spotl_predict_tides(latitude, longitude, elevation, glob_oc, reg_oc, greenf)

Returns the complex numbers (from amplitude and phase) for the predicted areal and shear strains using SPOTL

Expects regional model polygons to have already been constructed in the working directory (in this case, in the Docker container).

Parameters:

latitude (float) – Station latitude
longitude (float) – Station longitude
elevation (float) – Station elevation (m)
glob_oc (str) – Global ocean model from SPOTL. e.g. osu.tpxo72.2010
reg_oc (str) – Regional ocean model from SPOTL. e.g. osu.usawest.2010
greenf (str) – Green’s functions for the elastic earth structure from SPOTL. e.g. green.contap.std

Returns:

Areal, differential, and shear strain complex numbers for the M2 and O1 tides. (eEE+eNN)m2, (eEE-eNN)m2, (2EN)m2, (eEE+eNN)o1, (eEE-eNN)o1, (2EN)o1

Return type:

dict
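
Combining an amplitude and a phase into a single complex number is a small step that is easy to get wrong when the phase is in degrees. The sketch below shows the conversion with the standard library, assuming phases are given in degrees; `to_complex` is a hypothetical helper, and the values are illustrative.

```python
import cmath
import math


def to_complex(amplitude, phase_deg):
    """Build amplitude * exp(i * phase) from a phase given in degrees."""
    return cmath.rect(amplitude, math.radians(phase_deg))


m2 = to_complex(10.0, 90.0)

# Recover amplitude and phase (degrees) from the complex value.
amplitude = abs(m2)
phase_deg = math.degrees(cmath.phase(m2))
print(m2, amplitude, phase_deg)
```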

earthscopestraintools.processing.start_df_at_300s(df)

Method to force 1 Hz data to start at a round 300 s timestamp. Will throw away up to 299 s of data. Used for prepping data for 300 s decimation.

Parameters: df (pd.DataFrame) – dataframe containing 1 Hz data
Returns: dataframe containing 1 Hz data
Return type: pd.DataFrame
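
The trimming step amounts to finding the first sample whose timestamp is a multiple of 300 s. A minimal sketch on epoch-second timestamps follows; the helper name `first_round_index` and the starting timestamp are assumptions made for the example.

```python
def first_round_index(timestamps, step=300):
    """Index of the first timestamp that is a multiple of `step` seconds."""
    for i, t in enumerate(timestamps):
        if t % step == 0:
            return i
    return None  # no round timestamp in the series


# Ten minutes of 1 Hz timestamps starting 23 s past a 300 s boundary.
times = list(range(1_700_000_123, 1_700_000_123 + 600))
i0 = first_round_index(times)
trimmed = times[i0:]
print(i0, trimmed[0])
```

At most `step - 1` samples are discarded, which matches the "up to 299 s" noted above for 1 Hz data.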

earthscopestraintools.processing.update_double_exponential_detrend_params(df, detrend_params)

Method to recalculate the six detrend coefficients for each channel. As implemented, this iterates on the coefficients in the metadata. If there are none,

Parameters:

df (_type_) – _description_
detrend_params (_type_) – _description_

Returns:

_description_

Return type:

_type_