earthscopestraintools.processing module
- earthscopestraintools.processing.apply_Agnew_2007(df: DataFrame)
Filter and decimate 1 Hz data to 5 min data using the filters described in:
Agnew, Duncan Carr, and K. Hodgkinson (2007), Designing compact causal digital filters for low-frequency strainmeter data, Bulletin of the Seismological Society of America, 97, No. 1B, 91-99
- Parameters:
df (pd.DataFrame) – 1 Hz continuous data
- Returns:
300s (5 min) data
- Return type:
pd.DataFrame
- earthscopestraintools.processing.apply_calibration_matrix(df: DataFrame, calibration_matrix: array, calibration_matrix_name: str | None = None, use_channels: list = [1, 1, 1, 1])
Applies a calibration matrix to convert 4 gauges into areal, differential, and shear strains
- Parameters:
df (pd.DataFrame) – four channels of strain data or correction in microstrain
calibration_matrix (np.array) – calibration matrix
calibration_matrix_name (str, optional) – name of calibration matrix used, defaults to None
use_channels (list, optional) – not yet implemented, set to 0 to ignore a bad channel, defaults to [1, 1, 1, 1]
- Returns:
areal, differential, and shear strains based on the given calibration
- Return type:
pd.DataFrame
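The matrix step described above can be pictured as a single matrix product. The following is a minimal sketch, not the library's implementation: it assumes a 3x4 calibration matrix, and the function name, output column names, and matrix values are all made up for illustration.

```python
import numpy as np
import pandas as pd


def apply_matrix_sketch(gauges: pd.DataFrame, matrix: np.ndarray) -> pd.DataFrame:
    """Multiply each row of four gauge strains by a 3x4 matrix (illustrative only)."""
    # Output column names are hypothetical; the library may label them differently.
    names = ["areal", "differential", "shear"]
    regional = gauges.to_numpy(dtype=float) @ matrix.T  # (n, 4) @ (4, 3) -> (n, 3)
    return pd.DataFrame(regional, index=gauges.index, columns=names)


index = pd.date_range("2023-01-01", periods=3, freq=pd.Timedelta(seconds=300))
gauges = pd.DataFrame(
    [[1.0, 2.0, 3.0, 4.0], [0.0, 0.0, 0.0, 0.0], [1.0, 1.0, 1.0, 1.0]],
    index=index,
    columns=["CH0", "CH1", "CH2", "CH3"],
)
# A made-up matrix, not a real station calibration.
matrix = np.array(
    [
        [0.25, 0.25, 0.25, 0.25],
        [0.5, -0.5, 0.0, 0.0],
        [0.0, 0.0, 0.5, -0.5],
    ]
)
regional = apply_matrix_sketch(gauges, matrix)
```

In practice the matrix would come from station metadata or a calibration study rather than being typed in by hand.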
- earthscopestraintools.processing.baytap_analysis(df, atmp_df, quality_df=None, atmp_quality_df=None, latitude=None, longitude=None, elevation=None, dmin=0.001)
Runs BAYTAP08 locally if available; otherwise uses a Docker container to run BAYTAP08 (Tamura 1991; Tamura and Agnew 2008) for tidal analysis. The time series (e.g. strain) and additional auxiliary input (e.g. pressure) are analyzed together to determine the amplitudes and phases of a combination of tidal constituents (M2, O1, P1, K1, N2, S2) in the time series, as well as a coefficient for the auxiliary input response. Please refer to the Baytap08 manual for more details, including the suggested length of the time series (365+ days) for full determination of the tidal constituent suite returned here.
- Parameters:
df (pd.DataFrame) – time series with datetime index and one channel per column. Strain should be in microstrain, and pressure in hPa.
atmp_df (pd.DataFrame) – atmospheric pressure data with datetime index
quality_df (pd.DataFrame) – flags designating the quality of the data. Any points that are not good (g) are ignored in the time series analysis.
units (str) – units of strain, should match microstrain or nanostrain (documented in the docstring but absent from the signature above)
atmp_quality_df (pd.DataFrame) – flags designating the quality of the pressure data. Any points that are not good (g) are ignored in the time series analysis.
atmp_units (str) – units of atmospheric pressure data, should be hPa (documented in the docstring but absent from the signature above)
latitude (float) – latitude of the station
longitude (float) – longitude of the station
elevation (float) – elevation of the station
dmin (float) – drift parameter for the program. Large drift expects a linear trend. Small drift allows for rapid changes in the residual time series.
- Returns:
amplitudes and phases for each tidal constituent per gauge, and the atmospheric pressure coefficient
- Return type:
dict
- earthscopestraintools.processing.butterworth_filter(df: DataFrame, period: float, filter_type: str, filter_order: int, filter_cutoff_s: float)
Apply a butterworth filter to a DataFrame using scipy.signal.butter()
- Parameters:
df (pd.DataFrame) – data to filter
period (float) – sample period of data
filter_type (str) – {‘lowpass’, ‘highpass’, ‘bandpass’, ‘bandstop’}
filter_order (int) – the order of the filter
filter_cutoff_s (float) – the filter cutoff period in seconds
- Returns:
butterworth filtered data
- Return type:
pandas.DataFrame
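To make the relationship between the sample period, the cutoff given in seconds, and scipy.signal.butter() concrete, here is a minimal low-pass sketch. It is not the library's code: the function name is hypothetical, and the use of second-order sections with zero-phase filtering (scipy.signal.sosfiltfilt) is an assumption made for this example.

```python
import numpy as np
import pandas as pd
from scipy import signal


def lowpass_sketch(df: pd.DataFrame, period: float, order: int, cutoff_s: float) -> pd.DataFrame:
    """Low-pass every column; zero-phase filtering is an assumption of this sketch."""
    fs = 1.0 / period    # sampling frequency in Hz
    fc = 1.0 / cutoff_s  # cutoff period in seconds converted to a frequency in Hz
    sos = signal.butter(order, fc, btype="lowpass", fs=fs, output="sos")
    filtered = signal.sosfiltfilt(sos, df.to_numpy(dtype=float), axis=0)
    return pd.DataFrame(filtered, index=df.index, columns=df.columns)


# Synthetic 1 Hz record: a slow 1-hour oscillation plus a fast 10 s oscillation.
t = np.arange(7200.0)
slow = np.sin(2 * np.pi * t / 3600.0)
fast = 0.5 * np.sin(2 * np.pi * t / 10.0)
index = pd.date_range("2023-01-01", periods=t.size, freq=pd.Timedelta(seconds=1))
noisy = pd.DataFrame({"CH0": slow + fast}, index=index)

# Keep periods longer than 300 s; the 10 s signal should be removed.
smooth = lowpass_sketch(noisy, period=1.0, order=4, cutoff_s=300.0)
```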
- earthscopestraintools.processing.calculate_double_exponential_trend_correction(df, detrend_params)
Use parameters from station metadata to calculate a double exponential trend correction. Only works on gauge data (there are no detrend parameters in the metadata for regional strains)
- Parameters:
df (pd.DataFrame) – uncorrected data, as dataframe with datetime index and CH0-CH3 as columns
detrend_params (dictionary) – detrend_params dictionary loaded by GtsmMetadata module
- Returns:
trend correction timeseries for CH0-CH3
- Return type:
_type_
- earthscopestraintools.processing.calculate_linear_trend_correction(df, method='linear', trend_start=None, trend_end=None)
Generate a linear trend correction via either a linear least squares calculation or a median trend calculation. The median trend calculation (based on MIDAS in Blewitt et al., 2016 for GNSS time series analysis) uses the median slope value from all points separated by roughly one lunar day, calculated after outliers beyond 2 median absolute deviations are removed. It will only work with > 3 days of data.
- Parameters:
df (pd.DataFrame) – uncorrected data, as dataframe with datetime index and 1 channel per column
method (str, default is linear) – linear or median
trend_start (datetime.datetime, optional) – start of window to calculate trend, defaults to first_valid_index()
trend_end (datetime.datetime, optional) – end of window to calculate trend, defaults to last_valid_index()
- Returns:
trend correction timeseries for each column/channel in input data
- Return type:
pd.DataFrame
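The least-squares option can be illustrated with numpy.polyfit. This is a minimal sketch under stated assumptions, not the library's routine: the function name is hypothetical, the trend is referenced to zero at the first sample, and the sign convention of the returned series is an assumption.

```python
import numpy as np
import pandas as pd


def linear_trend_sketch(df: pd.DataFrame) -> pd.DataFrame:
    """Fit a straight line to each column and return the fitted trend (zero at the first sample)."""
    t = (df.index - df.index[0]).total_seconds().to_numpy()  # seconds since the first sample
    trend = {}
    for col in df.columns:
        y = df[col].to_numpy(dtype=float)
        ok = ~np.isnan(y)  # ignore gaps when fitting
        slope, _intercept = np.polyfit(t[ok], y[ok], 1)
        trend[col] = slope * t
    return pd.DataFrame(trend, index=df.index)


# Ten days of hourly synthetic data with a known slope of 1e-5 per second.
index = pd.date_range("2023-01-01", periods=240, freq=pd.Timedelta(hours=1))
seconds = np.arange(240) * 3600.0
data = pd.DataFrame({"CH0": 1.0 + 1e-5 * seconds}, index=index)
trend = linear_trend_sketch(data)
```

Whether the correction is then added to or subtracted from the data depends on the caller's convention, which this sketch does not fix.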
- earthscopestraintools.processing.calculate_offsets(df, limit_multiplier: int = 10, cutoff_percentile: float = 0.75)
Calculate offsets using first differencing method (add more details).
- Parameters:
df (pandas.DataFrame) – uncorrected data, as dataframe with datetime (in seconds) index and 1 channel per column
limit_multiplier (int, optional) – _description_, defaults to 10
cutoff_percentile (float, optional) – _description_, defaults to 0.75
- Returns:
_description_
- Return type:
_type_
- earthscopestraintools.processing.calculate_pressure_correction(df: DataFrame, response_coefficients: dict)
Generate a pressure correction timeseries from pressure data and response coefficients
- Parameters:
df (pd.DataFrame) – atmospheric pressure data
response_coefficients (dict) – response coefficients for each channel loaded from metadata
- Returns:
pressure corrections for each channel
- Return type:
pd.DataFrame
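Conceptually, a pressure correction scales the pressure record by one response coefficient per channel. The sketch below assumes a single pressure column, coefficients in microstrain per hPa, and a plain product with no mean removal or sign flip; the function name and the coefficient values are hypothetical and the library's conventions may differ.

```python
import pandas as pd


def pressure_correction_sketch(df: pd.DataFrame, response_coefficients: dict) -> pd.DataFrame:
    """Scale the first pressure column by each channel's coefficient (illustrative only)."""
    pressure = df.iloc[:, 0]  # assumes the first column holds pressure in hPa
    return pd.DataFrame({ch: pressure * coeff for ch, coeff in response_coefficients.items()})


index = pd.date_range("2023-01-01", periods=2, freq=pd.Timedelta(seconds=300))
atmp = pd.DataFrame({"atmp": [1000.0, 1010.0]}, index=index)
# Made-up coefficients, not values from real station metadata.
coeffs = {"CH0": -0.004, "CH1": 0.002}
correction = pressure_correction_sketch(atmp, coeffs)
```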
- earthscopestraintools.processing.calculate_tide_correction(df, period, tidal_parameters, longitude)
Generate tidal correction timeseries using SPOTL hartid
- Parameters:
df (pd.DataFrame) – uncorrected data, as dataframe with datetime index and 1 channel per column
period (int) – sample period of data, must be >= 1
tidal_parameters (dict) – tidal parameters loaded from station metadata
longitude (float) – station longitude
- Returns:
tidal correction timeseries for each column/channel in input data
- Return type:
pd.DataFrame
- earthscopestraintools.processing.decimate_1s_to_300s(df: DataFrame, method: str = 'linear', limit: int = 3600)
Filter and decimate 1 Hz data to 5 min data using the filters described in:
Agnew, Duncan Carr, and K. Hodgkinson (2007), Designing compact causal digital filters for low-frequency strainmeter data, Bulletin of the Seismological Society of America, 97, No. 1B, 91-99
This function will interpolate gaps up to ‘limit’ samples. If gaps remain, it will break the dataframe into continuous chunks, apply the filter to each one, and then recombine into a single dataframe. Remaining gaps are filled with NaNs.
- Parameters:
df (pd.DataFrame) – 1 Hz data
method (str, optional) – method to interpolate across gaps, defaults to “linear”
limit (int, optional) – largest gap to interpolate, defaults to 3600 samples
- Returns:
300s (5 min) data
- Return type:
pd.DataFrame
- earthscopestraintools.processing.decimate_to_hourly(df: DataFrame)
Decimates a timeseries to hourly by selecting the sample at the first minute and first second of each hour
- Parameters:
df (pd.DataFrame) – time series data to decimate
- Returns:
decimated data
- Return type:
pd.DataFrame
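Selecting on-the-hour samples can be done with a boolean mask on the datetime index. This is a minimal sketch of that idea with a hypothetical function name; it assumes the index is a pandas DatetimeIndex and is not taken from the library source.

```python
import numpy as np
import pandas as pd


def hourly_sketch(df: pd.DataFrame) -> pd.DataFrame:
    """Keep only rows whose timestamp has minute 0 and second 0."""
    on_the_hour = (df.index.minute == 0) & (df.index.second == 0)
    return df[on_the_hour]


# Three hours of 5 min data: 36 samples from 00:00 to 02:55.
index = pd.date_range("2023-01-01", periods=36, freq=pd.Timedelta(seconds=300))
five_min = pd.DataFrame({"CH0": np.arange(36.0)}, index=index)
hourly = hourly_sketch(five_min)
```

No filtering is applied here, so the input should already be low-passed (for example the 300 s product) to avoid aliasing.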
- earthscopestraintools.processing.double_exponential_trend_model(t, F, A1, T1, M, A2, T2)
The model used to fit a double exponential trend
- Parameters:
t (_type_) – _description_
F (_type_) – _description_
A1 (_type_) – _description_
T1 (_type_) – _description_
M (_type_) – _description_
A2 (_type_) – _description_
T2 (_type_) – _description_
- Returns:
_description_
- Return type:
_type_
- earthscopestraintools.processing.get_eig(df: DataFrame)
Tool to extract eigenvalues and azimuths (from north) from a timeseries with areal (Eee+Enn), differential (Eee-Enn), and engineering shear strain (2Een).
- Parameters:
df (pd.DataFrame) – dataframe with areal (Eee+Enn), differential (Eee-Enn), and engineering shear strain (2Een) columns
- Returns:
Amplitudes and azimuths for the two eigenvectors in order, amp1, az1, amp2, az2
- Return type:
np.array
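For a single sample, the conversion from areal, differential, and engineering shear strain to principal strains can be sketched as an eigen-decomposition of the 2x2 horizontal strain tensor. The function name, the largest-first ordering, and the folding of azimuths into the range 0 to 180 degrees are assumptions of this sketch rather than documented behavior of get_eig.

```python
import numpy as np


def principal_strains_sketch(areal: float, differential: float, shear: float):
    """Return [(amp1, az1), (amp2, az2)], largest eigenvalue first, azimuth in degrees from north."""
    eee = 0.5 * (areal + differential)  # east-east component
    enn = 0.5 * (areal - differential)  # north-north component
    een = 0.5 * shear                   # tensor shear is half the engineering shear
    tensor = np.array([[eee, een], [een, enn]])  # axes ordered (east, north)
    values, vectors = np.linalg.eigh(tensor)     # eigenvalues in ascending order
    result = []
    for i in (1, 0):  # largest eigenvalue first (an assumption of this sketch)
        east, north = vectors[:, i]
        azimuth = np.degrees(np.arctan2(east, north)) % 180.0  # clockwise from north
        result.append((float(values[i]), float(azimuth)))
    return result


# Pure engineering shear of 2 gives principal strains of +1 at 45 degrees and -1 at 135 degrees.
(amp1, az1), (amp2, az2) = principal_strains_sketch(0.0, 0.0, 2.0)
```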
- earthscopestraintools.processing.interpolate(df: DataFrame, replace: int = 999999, method: str = 'linear', limit: int = 3600, limit_direction='forward')
Interpolate across gaps in data using pd.DataFrame.interpolate()
- Parameters:
df (pd.DataFrame) – data to be interpolated
replace (int, optional) – gap fill value to interpolate across, defaults to 999999
method (str, optional) – interpolation method, defaults to “linear”
limit (int, optional) – max number of samples to interpolate, defaults to 3600
limit_direction (str, optional) – [‘forward’, ‘backward’, ‘both’], defaults to “forward”
- Returns:
interpolated data
- Return type:
pd.DataFrame
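Since the function is described as a wrapper around pd.DataFrame.interpolate(), the gap handling can be sketched as below. The function name is hypothetical, and the step of converting the fill value to NaN before interpolating is an assumption about how the replace argument is used.

```python
import numpy as np
import pandas as pd


def interpolate_sketch(df: pd.DataFrame, replace: float = 999999, method: str = "linear",
                       limit: int = 3600) -> pd.DataFrame:
    """Turn the gap fill value into NaN, then interpolate at most `limit` samples per gap."""
    gappy = df.replace(replace, np.nan)
    return gappy.interpolate(method=method, limit=limit, limit_direction="forward")


index = pd.date_range("2023-01-01", periods=4, freq=pd.Timedelta(seconds=1))
raw = pd.DataFrame({"CH0": [1.0, 999999.0, 999999.0, 4.0]}, index=index)

filled = interpolate_sketch(raw, limit=3600)   # both gap samples are filled
partial = interpolate_sketch(raw, limit=1)     # only the first gap sample is filled
```

With limit_direction="forward", a gap longer than limit is filled only for its first limit samples and the remainder stays NaN.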
- earthscopestraintools.processing.is_continuous(df)
Method to determine if there are gaps in a dataframe
- Parameters:
df (pd.DataFrame) – dataframe
- Returns:
True if no gaps, False if there are gaps
- Return type:
bool
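One way to test for gaps is to compare the spacing between consecutive timestamps. This sketch treats the smallest spacing as the nominal sample period, which is an assumption; the library may determine the period differently, and the function name is hypothetical.

```python
import numpy as np
import pandas as pd


def is_continuous_sketch(df: pd.DataFrame) -> bool:
    """True if every step between consecutive timestamps equals the smallest step."""
    steps = df.index.to_series().diff().dropna()
    return bool((steps == steps.min()).all())


index = pd.date_range("2023-01-01", periods=10, freq=pd.Timedelta(seconds=1))
regular = pd.DataFrame({"CH0": np.arange(10.0)}, index=index)
gappy = regular.drop(index[4:6])  # remove two samples to create a gap

regular_ok = is_continuous_sketch(regular)
gappy_ok = is_continuous_sketch(gappy)
```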
- earthscopestraintools.processing.linearize(df: DataFrame, reference_strains: dict, gap: float)
Linearize raw gauge strain.
- Parameters:
df (pd.DataFrame) – Dataframe with four columns corresponding to raw gauge data in units of counts.
reference_strains (dict) – Dictionary with four entries noting the reference count value on each gauge to zero the timeseries against in the conversion to gauge strain (in microstrain).
gap (float) – Instrument measurement gap. For most NOTA strainmeters, the reference gap is 0.0001 m.
- Returns:
DataFrame with four columns of microstrain.
- Return type:
pd.DataFrame
- earthscopestraintools.processing.split_into_continuous_dataframes(df)
Method for breaking gappy data into multiple continuous dataframes. Used for decimation of 1s data to 300s.
- Parameters:
df (pd.DataFrame) – non-continuous dataframe
- Returns:
list of continuous dataframes
- Return type:
list[pd.DataFrame]
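Splitting on gaps can be sketched by labeling each row with a running count of the gaps seen so far and grouping on that label. The function name and the explicit period argument are hypothetical; the library function takes only the dataframe.

```python
import numpy as np
import pandas as pd


def split_sketch(df: pd.DataFrame, period: float) -> list:
    """Split wherever the step between timestamps exceeds the sample period (in seconds)."""
    steps = df.index.to_series().diff()
    # Each gap increments the chunk id; the first row has no previous step and stays in chunk 0.
    chunk_id = (steps > pd.Timedelta(seconds=period)).cumsum()
    return [chunk for _, chunk in df.groupby(chunk_id)]


index = pd.date_range("2023-01-01", periods=10, freq=pd.Timedelta(seconds=1))
full = pd.DataFrame({"CH0": np.arange(10.0)}, index=index)
gappy = full.drop(index[4:6])  # rows 4 and 5 are missing

chunks = split_sketch(gappy, period=1.0)
```

Each chunk can then be filtered on its own and the results concatenated, which matches the workflow described for the 1 s to 300 s decimation.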
- earthscopestraintools.processing.spotl_predict_tides(latitude, longitude, elevation, glob_oc, reg_oc, greenf)
Returns the complex numbers (from amplitude and phase) for the predicted areal and shear strains using SPOTL.
Expects regional model polygons to have already been constructed in the working directory (in this case, in the Docker container).
- Parameters:
latitude (float) – Station latitude
longitude (float) – Station longitude
elevation (float) – Station elevation (m)
glob_oc (str) – Global ocean model from SPOTL. e.g. osu.tpxo72.2010
reg_oc (str) – Regional ocean model from SPOTL. e.g. osu.usawest.2010
greenf (str) – Green’s functions for the elastic earth structure from SPOTL. e.g. green.contap.std
- Returns:
Areal, differential, and shear strain complex numbers for the M2 and O1 tides. (eEE+eNN)m2, (eEE-eNN)m2, (2EN)m2, (eEE+eNN)o1, (eEE-eNN)o1, (2EN)o1
- Return type:
dict
- earthscopestraintools.processing.start_df_at_300s(df)
Method to force 1 Hz data to start at a round 300 s timestamp. Will throw away up to 299 s of data. Used for prepping data for 300 s decimation.
- Parameters:
df (pd.DataFrame) – dataframe containing 1 Hz data
- Returns:
dataframe containing 1 Hz data
- Return type:
pd.DataFrame
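Trimming to a round 300 s boundary amounts to finding the first timestamp that falls on a multiple of five minutes. The sketch below is illustrative, with a hypothetical function name; it assumes a DatetimeIndex with whole-second timestamps and returns an empty dataframe if no aligned sample exists.

```python
import numpy as np
import pandas as pd


def start_at_300s_sketch(df: pd.DataFrame) -> pd.DataFrame:
    """Drop leading rows until the index sits on a multiple of 300 s."""
    aligned = (df.index.minute % 5 == 0) & (df.index.second == 0)
    if not aligned.any():
        return df.iloc[0:0]  # no round 300 s timestamp in this record
    first = int(np.argmax(aligned))  # position of the first aligned sample
    return df.iloc[first:]


# Ten minutes of 1 Hz data starting at 00:01:40, i.e. 200 s before 00:05:00.
index = pd.date_range("2023-01-01 00:01:40", periods=600, freq=pd.Timedelta(seconds=1))
one_hz = pd.DataFrame({"CH0": np.arange(600.0)}, index=index)
trimmed = start_at_300s_sketch(one_hz)
```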
- earthscopestraintools.processing.update_double_exponential_detrend_params(df, detrend_params)
Method to recalculate the six detrend coefficients for each channel. As implemented, this iterates on the coefficients in the metadata. If there are none,
- Parameters:
df (_type_) – _description_
detrend_params (_type_) – _description_
- Returns:
_description_
- Return type:
_type_