earthscopestraintools.timeseries module

class earthscopestraintools.timeseries.Timeseries(data: DataFrame | None = None, quality_df: DataFrame | None = None, series: str = '', units: str = '', level: str = '', period: float | None = None, name: str | None = None, network: str = '', station: str = '', show_stats: bool = True)

Bases: object

Class for storing strainmeter data or correction timeseries. Each Timeseries object contains the following attributes:

data: (pd.DataFrame) with datetime index and one or more columns of timeseries data
quality_df: (pd.DataFrame) with same shape as data, but with a character mapped to each data point. Flags include “g”=good, “m”=missing, “i”=interpolated, “b”=bad
series: (str) description of timeseries, e.g. ‘raw’, ‘microstrain’, ‘atmp_c’, ‘tide_c’, ‘offset_c’, ‘trend_c’
units: (str) units of timeseries
level: (str) level of data, e.g. ‘0’, ‘1’, ‘2a’, ‘2b’
period: (float) sample period of data. If not provided, this will be inferred from the data
name: (str) name of timeseries, including station name. useful for showing stats. defaults to network.station
network: (str) FDSN two character network code
station: (str) FDSN four character station code
show_stats: (bool) Print gap analysis. Defaults to True
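
As a minimal sketch of the layout described above, the following builds a `data` frame and a matching `quality_df` with plain pandas. The column names (`CH0`, `CH1`), the sample values, and the use of 999999 as the gap fill value are illustrative assumptions; the snippet does not call `Timeseries` itself.

```python
import pandas as pd

# Hypothetical 1 Hz data for two channels; 999999 stands in for a gap fill value.
index = pd.date_range("2019-07-01", periods=5, freq="1s")
data = pd.DataFrame(
    {"CH0": [10.0, 11.0, 999999.0, 13.0, 14.0],
     "CH1": [20.0, 21.0, 22.0, 23.0, 24.0]},
    index=index,
)

# quality_df has the same shape as data, with one flag character per data point.
quality_df = pd.DataFrame("g", index=data.index, columns=data.columns)
quality_df[data == 999999] = "m"  # mark fill values as missing

# Infer the sample period (in seconds) from the spacing of the index.
period = data.index.to_series().diff().dt.total_seconds().median()
```

Keeping the flags in a separate frame of the same shape means any data point can be looked up in both frames by the same timestamp and column.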

append(ts2, in_place=False, show_stats=True)

apply_calibration_matrix(calibration_matrix: array, calibration_matrix_name: str | None = None, use_channels: list = [1, 1, 1, 1], name: str | None = None)

Applies a calibration matrix to convert 4 gauges into areal, differential, and shear strains

Parameters:

calibration_matrix (np.array) – calibration matrix
calibration_matrix_name (str, optional) – name of calibration matrix used, defaults to None
use_channels (list, optional) – not yet implemented, set to 0 to ignore a bad channel, defaults to [1, 1, 1, 1]
name (str, optional) – name for new Timeseries, defaults to None

Returns:

areal, differential, and shear strains based on the given calibration

Return type:

Timeseries
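
The matrix product behind this step can be sketched with NumPy alone. The 3x4 matrix and the gauge values below are made-up numbers for illustration; real calibration matrices come from station metadata, and this is not the library's own implementation.

```python
import numpy as np

# Hypothetical 3x4 calibration matrix: rows map the four gauge strains to
# areal, differential, and shear strain.
calibration_matrix = np.array([
    [0.5, 0.5, 0.5, 0.5],
    [0.3, -0.6, 0.1, 0.2],
    [-0.4, 0.1, 0.5, -0.2],
])

# Gauge strains: one row per sample, one column per gauge.
gauges = np.array([
    [1.0, 2.0, 3.0, 4.0],
    [0.0, 1.0, 0.0, 1.0],
])

# Each output row equals calibration_matrix @ gauge_vector for that sample.
regional = gauges @ calibration_matrix.T
areal, differential, shear = regional.T
```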

apply_corrections(corrections: list = [], name: str | None = None)

applies one or more corrections to a Timeseries and returns the corrected Timeseries

Parameters:

corrections (list, optional) – List of correction Timeseries to apply, defaults to []
name (str, optional) – name for new Timeseries, defaults to None

Returns:

corrected Timeseries

Return type:

Timeseries
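
A rough picture of what applying corrections means, using plain pandas frames in place of Timeseries objects. The sketch assumes each correction is removed by subtraction and that all frames share the same index and columns; the values and the `CH0` column name are hypothetical.

```python
import pandas as pd

index = pd.date_range("2019-07-01", periods=3, freq="1h")
strain = pd.DataFrame({"CH0": [5.0, 6.0, 7.0]}, index=index)
tide_c = pd.DataFrame({"CH0": [1.0, 2.0, 1.0]}, index=index)
atmp_c = pd.DataFrame({"CH0": [0.5, 0.5, 0.5]}, index=index)

# Assumption: each correction is subtracted from the data, aligned on the index.
corrected = strain.copy()
for correction in [tide_c, atmp_c]:
    corrected = corrected - correction
```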

baytap_analysis(atmp_ts, latitude=None, longitude=None, elevation=None, dmin=0.001)

This function accesses a docker container to run BAYTAP08 (Tamura 1991; Tamura and Agnew 2008) for tidal analysis. Time series (e.g. strain) and additional auxiliary input (e.g. pressure) are analyzed together to determine the amplitudes and phases of a combination of tidal constituents (M2, O1, P1, K1, N2, S2) in the time series, as well as a coefficient for the auxiliary input response.

Parameters:

atmp_ts (Timeseries) – Atmospheric pressure time series with same sample period and time frame as the strain data.
latitude (float) – latitude of the station
longitude (float) – longitude of the station
elevation (float) – elevation of the station
dmin (float) – Drift parameter for the program. Large drift expects a linear trend. Small drift allows for rapid changes in the residual time series.

Returns:

Dictionary of amplitudes and phases for each tidal constituent per gauge, and atmospheric pressure coefficient.

Return type:

dict

butterworth_filter(filter_type: str, filter_order: int, filter_cutoff_s: float, series: str = '', name: str | None = None)

Apply a butterworth filter to a DataFrame using scipy.signal.butter()

Parameters:

filter_type (str) – {‘lowpass’, ‘highpass’, ‘bandpass’, ‘bandstop’}
filter_order (int) – the order of the filter
filter_cutoff_s (float) – the filter cutoff in seconds
series (str, optional) – series name, defaults to “”
name (str, optional) – name for new Timeseries, defaults to None

Returns:

Timeseries containing filtered data

Return type:

Timeseries
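
The sketch below shows one way a cutoff given in seconds can be passed to scipy.signal.butter(). It assumes `filter_cutoff_s` is a period, so the cutoff frequency is its reciprocal, and it applies the filter in zero-phase form with `sosfiltfilt`. Both choices are assumptions rather than a description of what this method does internally, and the test signal is synthetic.

```python
import numpy as np
from scipy import signal

period = 1.0             # sample period in seconds (1 Hz data)
filter_cutoff_s = 100.0  # cutoff expressed as a period in seconds
filter_order = 4

# Convert the cutoff period to a frequency in Hz and design the filter.
fs = 1.0 / period
fc = 1.0 / filter_cutoff_s
sos = signal.butter(filter_order, fc, btype="lowpass", fs=fs, output="sos")

# Synthetic input: a slow 1000 s oscillation plus a fast 10 s oscillation.
t = np.arange(0, 5000, period)
slow = np.sin(2 * np.pi * t / 1000.0)
fast = 0.5 * np.sin(2 * np.pi * t / 10.0)

# Zero-phase filtering (an assumption; the library may filter causally).
filtered = signal.sosfiltfilt(sos, slow + fast)
```

With a 100 s cutoff, the 10 s oscillation is strongly attenuated while the 1000 s oscillation passes nearly unchanged.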

calculate_magnitude(hypocentral_distance, site_term: float = 0, longitude_term=0, name: str | None = None)

Calculates a magnitude estimate based on Barbour et al 2021.

Parameters:

hypocentral_distance (float) – distance from station to event hypocenter, in km
site_term (float, optional) – site term from Barbour et al 2021, defaults to 0
longitude_term (int, optional) – longitude term from Barbour et al 2021, defaults to 0
name (str, optional) – name for new Timeseries, defaults to None

Returns:

Timeseries containing a dynamic strain based magnitude estimate as a function of time

Return type:

Timeseries

calculate_offsets(limit_multiplier: int = 10, cutoff_percentile: float = 0.75, name: str | None = None)

Calculate offsets using first differencing method (add more details).

Parameters:

limit_multiplier (int, optional) – _description_, defaults to 10
cutoff_percentile (float, optional) – _description_, defaults to 0.75
name (str, optional) – name for new Timeseries, defaults to None

Returns:

_description_

Return type:

_type_

calculate_pressure_correction(response_coefficients: dict, name: str | None = None)

Generate a pressure correction timeseries from pressure data and response coefficients

Parameters:

response_coefficients (dict) – response coefficients for each channel loaded from metadata
name (str, optional) – name for new Timeseries, defaults to None

Returns:

pressure corrections for each channel

Return type:

Timeseries
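
One plausible reading of this step is a per-channel scaling of the pressure record. The sketch assumes exactly that (correction = pressure x coefficient); the coefficient values, channel names, and pressure values are hypothetical.

```python
import pandas as pd

index = pd.date_range("2019-07-01", periods=3, freq="1h")
# Hypothetical atmospheric pressure record; units are illustrative.
pressure = pd.Series([1000.0, 1002.0, 999.0], index=index)

# Hypothetical response coefficients (strain per unit pressure) for each channel.
response_coefficients = {"CH0": 0.002, "CH1": -0.004}

# Assumption: the correction for each channel is pressure scaled by its coefficient.
pressure_correction = pd.DataFrame(
    {channel: pressure * coeff for channel, coeff in response_coefficients.items()}
)
```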

calculate_tide_correction(tidal_parameters: dict, longitude: float, name: str | None = None)

Generate tidal correction timeseries using SPOTL hartid

Parameters:

tidal_parameters (dict) – tidal parameters loaded from station metadata
longitude (float) – station longitude
name (str, optional) – name for new Timeseries, defaults to None

Returns:

tidal correction Timeseries calculated for each column/channel in input data

Return type:

Timeseries

check_for_gaps(show_stats=True)

Generates some statistics around nans, fill values, and missing epochs.

decimate_1s_to_300s(method: str = 'linear', limit: int = 3600, name: str | None = None)

Decimates 1 Hz data to 5-minute data using the filters described in:

Agnew, Duncan Carr, and K. Hodgkinson (2007), Designing compact causal digital filters for low-frequency strainmeter data, Bulletin of the Seismological Society of America, 97, No. 1B, 91-99

Parameters:

method (str, optional) – method to interpolate across gaps, defaults to “linear”
limit (int, optional) – largest gap to interpolate, defaults to 3600 samples
name (str, optional) – name for new Timeseries, defaults to None

Returns:

Timeseries containing 300s decimated data

Return type:

Timeseries

decimate_to_hourly(name: str | None = None)

Decimates a timeseries to hourly by selecting the sample at the first minute and second of each hour

Parameters:

df (pd.DataFrame) – time series data to decimate
name (str, optional) – name for new Timeseries, defaults to None

Returns:

Timeseries containing hourly decimated data

Return type:

Timeseries
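
Selecting on-the-hour samples can be sketched in pandas as follows. The 5-minute input and the `CH0` column are hypothetical, and the snippet works on a bare DataFrame rather than a Timeseries.

```python
import numpy as np
import pandas as pd

# Three hours of hypothetical 5-minute data.
index = pd.date_range("2019-07-01 00:00:00", periods=36, freq="5min")
df = pd.DataFrame({"CH0": np.arange(36, dtype=float)}, index=index)

# Keep only the samples that fall exactly on the hour.
on_the_hour = (df.index.minute == 0) & (df.index.second == 0)
hourly = df[on_the_hour]
```

Because this simply picks samples and applies no anti-alias filter, it is best suited to data that has already been low-pass filtered.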

double_exponential_trend_correction(detrend_params, name: str | None = None)

dynamic_strain(gauge_weights: list = [1, 1, 1, 1], series='dynamic', name=None)

calculates dynamic strain for a given Timeseries as RMS of gauge strains

Parameters:

gauge_weights (list, optional) – list of which channels to use, defaults to [1, 1, 1, 1]
series (str, optional) – series name, defaults to “dynamic”
name (str, optional) – name for new Timeseries, defaults to None

Returns:

calculated dynamic strain as a Timeseries object

Return type:

Timeseries
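
A minimal sketch of an RMS across gauges is shown below. The exact weighting and normalization used by this method are not stated here, so the weighted mean in the snippet is an assumption, and the gauge values are made up.

```python
import numpy as np

# Hypothetical gauge strains: one row per sample, one column per gauge.
gauges = np.array([
    [3.0, 4.0, 0.0, 0.0],
    [1.0, 1.0, 1.0, 1.0],
])
gauge_weights = np.array([1, 1, 1, 1])

# Weighted root mean square across gauges for each sample.
dynamic = np.sqrt((gauge_weights * gauges**2).sum(axis=1) / gauge_weights.sum())
```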

get_eig(name: str | None = None)

Tool to extract eigenvalues and azimuths (from north) from a timeseries with areal (Eee+Enn), differential (Eee-Enn), and engineering shear strain (2Een).

Parameters:

df (pd.DataFrame) – dataframe with areal (Eee+Enn), differential (Eee-Enn), and engineering shear strain (2Een) columns
name (str, optional) – name for new Timeseries, defaults to None

Returns:

Timeseries containing amplitudes and azimuths (degrees from north) for the two eigenvectors

Return type:

Timeseries
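
The relationship between the three regional strains and the principal strains can be worked through for a single sample. The sketch uses the standard 2x2 strain tensor [[Eee, Een], [Een, Enn]] with east as x and north as y. The input values are hypothetical, and the azimuth convention (clockwise from north, folded into 0-180 degrees) is an assumption about how the result is reported.

```python
import numpy as np

# Hypothetical regional strains: areal (Eee+Enn), differential (Eee-Enn),
# and engineering shear (2Een).
areal, differential, shear = 3.0, 1.0, 2.0

# Principal strains of the tensor [[Eee, Een], [Een, Enn]].
radius = 0.5 * np.hypot(differential, shear)
e_max = 0.5 * areal + radius
e_min = 0.5 * areal - radius

# Direction of e_max: angle measured counterclockwise from east, then
# converted to an azimuth measured clockwise from north.
theta_from_east = 0.5 * np.degrees(np.arctan2(shear, differential))
azimuth_max = (90.0 - theta_from_east) % 180.0
azimuth_min = (azimuth_max + 90.0) % 180.0

# Cross-check against a direct eigendecomposition of the tensor.
e_ee = 0.5 * (areal + differential)
e_nn = 0.5 * (areal - differential)
e_en = 0.5 * shear
tensor = np.array([[e_ee, e_en], [e_en, e_nn]])
eigenvalues = np.linalg.eigvalsh(tensor)  # returned in ascending order
```

The closed-form expressions avoid a per-sample eigendecomposition, which matters when the same calculation is applied to every row of a long time series.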

interpolate(replace: int = 999999, method: str = 'linear', limit_seconds: int = 3600, limit_direction='both', name: str | None = None, new_index: DatetimeIndex | None = None, period=None, level=None, series=None)

Interpolate across gaps in data using pd.DataFrame.interpolate()

Parameters:

replace (int, optional) – gap fill value to interpolate across, defaults to 999999
method (str, optional) – interpolation method, defaults to “linear”
limit_seconds (int, optional) – max gap (in seconds) to interpolate , defaults to 3600
limit_direction (str, optional) – [‘forward’, ‘backward’, ‘both’], defaults to “both”
name (str, optional) – name for new Timeseries, defaults to None
new_index (pd.DatetimeIndex, optional) – option to manually set the index of the interpolated data, defaults to None
period (float, optional) – sample rate of data in seconds, defaults to None
level (str, optional) – level of data, defaults to None
series (str, optional) – series name, defaults to None

Returns:

Timeseries containing interpolated data

Return type:

Timeseries
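
The gap handling described above maps onto pd.DataFrame.interpolate() roughly as follows. The sketch assumes the limit in seconds is converted to a sample count using the sample period; the data values and `CH0` column are hypothetical.

```python
import numpy as np
import pandas as pd

period = 300.0        # sample period in seconds (5-minute data)
limit_seconds = 3600  # largest gap to fill, in seconds
replace = 999999      # gap fill value

index = pd.date_range("2019-07-01", periods=6, freq="5min")
df = pd.DataFrame({"CH0": [1.0, 2.0, 999999.0, 999999.0, 5.0, 6.0]}, index=index)

# Turn fill values into NaN, then interpolate gaps up to limit_seconds long.
limit_samples = int(limit_seconds / period)
filled = df.replace(replace, np.nan).interpolate(
    method="linear", limit=limit_samples, limit_direction="both"
)
```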

linear_trend_correction(method='linear', trend_start=None, trend_end=None, name: str | None = None)

Generate a linear trend correction

Parameters:

method (str) – linear or median, defaults to linear
trend_start (datetime.datetime, optional) – start of window to calculate trend, defaults to first_valid_index()
trend_end (datetime.datetime, optional) – end of window to calculate trend, defaults to last_valid_index()
name (str, optional) – name for new Timeseries, defaults to None

Returns:

trend correction timeseries for each column/channel in input data

Return type:

Timeseries
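
For the ‘linear’ method, a least-squares fit is one natural way to build such a correction. The sketch below uses np.polyfit on a hypothetical daily series and references the trend to zero at the first sample; both the fitting routine and the zero reference are assumptions, not a statement of how this method is implemented.

```python
import numpy as np
import pandas as pd

index = pd.date_range("2019-07-01", periods=5, freq="1D")
df = pd.DataFrame({"CH0": [0.0, 2.0, 4.0, 6.0, 8.0]}, index=index)

# Elapsed time in days since the first sample.
elapsed_days = np.asarray((df.index - df.index[0]).total_seconds()) / 86400.0

# Least-squares line through the data; the correction is the fitted trend,
# referenced to zero at the first sample (an assumption).
slope, intercept = np.polyfit(elapsed_days, df["CH0"].to_numpy(), 1)
trend_c = pd.DataFrame({"CH0": slope * elapsed_days}, index=df.index)
```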

linearize(reference_strains: dict, gap: float, name: str | None = None)

Processing step to convert digital counts to microstrain based on geometry of GTSM gauges

Parameters:

reference_strains (dict) – dict containing keys of CHX and values of reference strains
gap (float) – instrument gap in meters
name (str, optional) – name for new Timeseries, defaults to None

Returns:

Timeseries of linearized data in microstrain

Return type:

Timeseries

plot(title: str | None = None, remove_9s: bool = False, zero: bool = False, detrend: str | None = None, ymin: float | None = None, ymax: float | None = None, type: str = 'line', show_quality_flags: bool = False, atmp=None, rainfall=None, save_as: str | None = None)

Generic plotting function for Timeseries data

Parameters:

title (str, optional) – plot title, defaults to None
remove_9s (bool, optional) – option to remove gap fill values, defaults to False
zero (bool, optional) – option to zero against first_valid_index(), defaults to False
detrend (str, optional) – signal.detrend type, only ‘linear’ implemented currently, defaults to None
ymin (float, optional) – y-axis minimum for plot, defaults to None
ymax (float, optional) – y-axis maximum for plot, defaults to None
type (str, optional) – matplotlib plot type. option of [‘scatter’, ‘line’], defaults to “line”
show_quality_flags (bool, optional) – option to highlight missing data flags, defaults to False
atmp (Timeseries, optional) – optional Timeseries containing atmospheric pressure data to be plotted in an extra subplot, defaults to None
rainfall (Timeseries, optional) – optional Timeseries containing rainfall data to be plotted in an extra subplot. will also plot cumsum of rainfall during time window. defaults to None
save_as (str, optional) – filename to save as, defaults to None

remove_fill_values(fill_value, interpolate: bool = False, method: str = 'linear', limit_direction: str = 'both', limit: any | None = None, show_stats: bool = True)

remove gap fill values from data, options to either replace with nans or interpolate

Parameters:

interpolate (bool, optional) – boolean of whether to interpolate across gaps using pd.DataFrame.interpolate(), defaults to False
method (str, optional) – interpolation method from pd.DataFrame.interpolate(), defaults to “linear”
limit_direction (str, optional) – limit direction from pd.DataFrame.interpolate(), defaults to “both”
limit (any, optional) – limit from pd.DataFrame.interpolate(), defaults to None
show_stats (bool, optional) – show gap analysis, defaults to True

Returns:

Timeseries with fill_value gap fills removed, and appropriate flags set

Return type:

Timeseries

save_csv(filename: str, datadir: str = './', sep=',', compression=None)

save data attribute as csv. flattens object, does not save quality flags, level, or version information

Parameters:

filename (str) – name of csv file to save
datadir (str, optional) – path to local directory to save file, defaults to “./”
sep (str, optional) – separator to use in csv, defaults to “,”
compression (str, optional) – compression algorithm [‘infer’, ‘gzip’, ‘bz2’, ‘zip’, ‘xz’, ‘zstd’], defaults to None

set_data(df)

set_local_tdb_uri(local_tdb_uri)

set_s3_tdb_uri(s3_tdb_uri)

set_units(units)

show_flagged_data()

returns dataframe containing any data with a flag other than ‘g’

Returns:

data that has been flagged

Return type:

pandas.DataFrame

show_flags()

returns a dataframe with all flags that are not ‘g’

Returns:

times and channels with flagged data within the timeseries

Return type:

pandas.DataFrame

stats(): displays summary information describing the Timeseries object

Displays a gif of the strain time series provided, with time series and strain axes displayed. Strain is shown relative to the first data point. :param start: (Optional) Start of the video as a datetime string. :type start: str :param end: (Optional) End of the video as a datetime string. :type end: str param skip: (optional) number of data points to skip per frame (eg. if using 5 minute Timeseries, skip=2 will decimate the dataset to a 10 minute period) :type skip: int :param interval: (Optional) Time between frames (in microseconds). :type interval: :type azimuth_arrow: (Optional) Directional arrow to plot behind the strain axes, in degrees (default is None) :param azimuth_arrow: float :param title: (Optional) Plot title :type title: str :param repeat: (Optional) Choose if the animation repeats. Defaults to false. :type repeat: bool :param units: (Optional) Units to label strain :type units: str :return: Gif of the strain time series :rtype: matplotlib.animation

Example

>>> # Import relevant modules from the earthscopestraintools package
>>> from earthscopestraintools.mseed_tools import ts_from_mseed
>>> from earthscopestraintools.gtsm_metadata import GtsmMetadata
>>> # Metadata
>>> network = 'PB'
>>> station = 'B004'
>>> meta = GtsmMetadata(network,station)
>>> # Provide the start and end times
>>> start = '2019-07-01'
>>> end = '2019-07-07'
>>>
>>> # load data
>>> strain_raw = ts_from_mseed(network=network, station=station, location='T0', channel='RS*', start=start, end=end)
>>> strain_linearized = strain_raw.linearize(reference_strains=meta.reference_strains,gap=meta.gap)
>>> strain_reg = strain_linearized.apply_calibration_matrix(calibration_matrix=meta.strain_matrices['ER2010'])
>>> # make video, save .gif
>>> %matplotlib widget
>>> anim = strain_reg.strain_video(interval=1, title=f'{station}, One Week',units='ms',savegif=f'{station}.{start}.{end}.gif')

truncate(new_start=None, new_end=None, in_place=False, show_stats=True)

Uses pandas.DataFrame.truncate() to trim the start and/or end of a Timeseries object

Parameters:

new_start (date, str, int, optional) – new beginning of Timeseries, defaults to None
new_end (date, str, int, optional) – new end of Timeseries, defaults to None

Returns:

truncated Timeseries object

Return type:

Timeseries
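
The underlying pandas call behaves as in the short sketch below, shown on a bare DataFrame with hypothetical daily data. Both bounds are inclusive.

```python
import pandas as pd

index = pd.date_range("2019-07-01", periods=7, freq="1D")
df = pd.DataFrame({"CH0": range(7)}, index=index)

# Keep only rows between the new start and end (both inclusive).
trimmed = df.truncate(before="2019-07-03", after="2019-07-05")
```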

earthscopestraintools.timeseries.plot_timeseries_comparison(timeseries: list = [], title: str | None = None, names: list = [], remove_9s: bool = False, zero: bool = False, detrend: str | None = None, type: str = 'line', save_as: str | None = None)

Plot multiple Timeseries in the same plot to compare values. Useful for viewing uncorrected vs corrected data.

Parameters:

timeseries (list, optional) – list of Timeseries to plot, defaults to []
title (str, optional) – plot title, defaults to None
names (list, optional) – list of names to use in legend, defaults to []
remove_9s (bool, optional) – option to remove gap fill values, defaults to False
zero (bool, optional) – option to zero against first_valid_index(), defaults to False
detrend (str, optional) – signal.detrend type, only ‘linear’ implemented currently, defaults to None
type (str, optional) – matplotlib plot type. option of [‘scatter’, ‘line’], defaults to “line”
save_as (str, optional) – filename to save as, defaults to None

earthscopestraintools.timeseries.test()