earthscopestraintools.timeseries module
- class earthscopestraintools.timeseries.Timeseries(data: DataFrame | None = None, quality_df: DataFrame | None = None, series: str = '', units: str = '', level: str = '', period: float | None = None, name: str | None = None, network: str = '', station: str = '', show_stats: bool = True)
Bases: object
Class for storing strainmeter data or correction timeseries. Each Timeseries object contains the following attributes:
data: (pd.DataFrame) with datetime index and one or more columns of timeseries data
quality_df: (pd.DataFrame) with same shape as data, but with a character mapped to each data point. flags include “g”=good, “m”=missing, “i”=interpolated, “b”=bad
series: (str) description of timeseries, e.g. ‘raw’, ‘microstrain’, ‘atmp_c’, ‘tide_c’, ‘offset_c’, ‘trend_c’
units: (str) units of timeseries
level: (str) level of data, e.g. ‘0’, ‘1’, ‘2a’, ‘2b’
period: (float) sample period of data. If not provided, this will be inferred from the data
name: (str) name of timeseries, including station name. useful for showing stats. defaults to network.station
network: (str) FDSN two character network code
station: (str) FDSN four character station code
show_stats: (bool) Print gap analysis. Defaults to True
- append(ts2, in_place=False, show_stats=True)
- apply_calibration_matrix(calibration_matrix: array, calibration_matrix_name: str | None = None, use_channels: list = [1, 1, 1, 1], name: str | None = None)
Applies a calibration matrix to convert 4 gauges into areal, differential, and shear strains
- Parameters:
calibration_matrix (np.array) – calibration matrix
calibration_matrix_name (str, optional) – name of calibration matrix used, defaults to None
use_channels (list, optional) – not yet implemented, set to 0 to ignore a bad channel, defaults to [1, 1, 1, 1]
name (str, optional) – name for new Timeseries, defaults to None
- Returns:
areal, differential, and shear strains based on the given calibration
- Return type:
Timeseries
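To illustrate only the matrix operation involved, the sketch below multiplies a 3x4 matrix by four gauge strain columns to obtain areal, differential, and shear strain. The matrix values, the sample data, and the helper `apply_matrix` are hypothetical and are not taken from the library or from any station metadata.

```python
import numpy as np

def apply_matrix(gauges: np.ndarray, calibration_matrix: np.ndarray) -> np.ndarray:
    """Map (n_samples, 4) gauge strains to (n_samples, 3) regional strains."""
    # Each output row is calibration_matrix @ gauge_row.
    return gauges @ calibration_matrix.T

# Hypothetical calibration matrix (3 strains x 4 gauges), for illustration only.
calibration_matrix = np.array([
    [0.25, 0.25, 0.25, 0.25],   # areal
    [0.50, -0.50, 0.00, 0.00],  # differential
    [0.00, 0.00, 0.50, -0.50],  # shear
])

# Two samples of made-up gauge microstrain (four gauges per sample).
gauges = np.array([
    [1.0, 1.0, 1.0, 1.0],
    [2.0, 0.0, 1.0, 3.0],
])

regional = apply_matrix(gauges, calibration_matrix)  # columns: areal, differential, shear
```

In real use the matrix would come from station metadata, as in the strain_video example where `meta.strain_matrices['ER2010']` is passed as `calibration_matrix`.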
- apply_corrections(corrections: list = [], name: str | None = None)
applies one or more corrections to a Timeseries and returns the corrected Timeseries
- Parameters:
corrections (list, optional) – List of correction Timeseries to apply, defaults to []
name (str, optional) – name for new Timeseries, defaults to None
- Returns:
corrected Timeseries
- Return type:
Timeseries
- baytap_analysis(atmp_ts, latitude=None, longitude=None, elevation=None, dmin=0.001)
This function accesses a docker container to run BAYTAP08 (Tamura 1991; Tamura and Agnew 2008) for tidal analysis. Time series (e.g. strain) and additional auxiliary input (e.g. pressure) are analyzed together to determine the amplitudes and phases of a combination of tidal constituents (M2, O1, P1, K1, N2, S2) in the time series, as well as a coefficient for the auxiliary input response.
- Parameters:
atmp_ts (Timeseries) – atmospheric pressure time series with the same sample period and time frame as the strain data
latitude (float) – latitude of the station
longitude (float) – longitude of the station
elevation (float) – elevation of the station
dmin (float) – drift parameter for the program. Large drift expects a linear trend. Small drift allows for rapid changes in the residual time series.
- Returns:
Dictionary of amplitudes and phases for each tidal constituent per gauge, and atmospheric pressure coefficient
- Return type:
dict
- butterworth_filter(filter_type: str, filter_order: int, filter_cutoff_s: float, series: str = '', name: str | None = None)
Apply a butterworth filter to a DataFrame using scipy.signal.butter()
- Parameters:
filter_type (str) – {‘lowpass’, ‘highpass’, ‘bandpass’, ‘bandstop’}
filter_order (int) – the order of the filter
filter_cutoff_s (float) – the filter cutoff in seconds
series (str, optional) – series name, defaults to “”
name (str, optional) – name for new Timeseries, defaults to None
- Returns:
Timeseries containing filtered data
- Return type:
Timeseries
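The relationship between a cutoff given in seconds and the frequency passed to scipy.signal.butter() can be sketched as follows. This is not the library's implementation: the helper `butterworth_sketch`, the use of second-order sections, and the zero-phase `sosfiltfilt` call are assumptions of this example, and it handles only a single cutoff (lowpass or highpass).

```python
import numpy as np
from scipy import signal

def butterworth_sketch(values, period_s, filter_type, filter_order, filter_cutoff_s):
    """Filter a 1-D array sampled every period_s seconds."""
    fs = 1.0 / period_s                # sampling frequency in Hz
    cutoff_hz = 1.0 / filter_cutoff_s  # cutoff period in seconds -> frequency in Hz
    sos = signal.butter(filter_order, cutoff_hz, btype=filter_type, fs=fs, output="sos")
    # Zero-phase (forward-backward) filtering is a choice made for this sketch.
    return signal.sosfiltfilt(sos, values)

# Synthetic 1 Hz data: a slow 1800 s oscillation plus a fast 10 s oscillation.
t = np.arange(0.0, 3600.0, 1.0)
slow = np.sin(2 * np.pi * t / 1800.0)
fast = 0.5 * np.sin(2 * np.pi * t / 10.0)

filtered = butterworth_sketch(slow + fast, period_s=1.0, filter_type="lowpass",
                              filter_order=2, filter_cutoff_s=100.0)
```

With a 100 s lowpass cutoff, the 10 s oscillation is strongly attenuated while the 1800 s oscillation passes nearly unchanged.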
- calculate_magnitude(hypocentral_distance, site_term: float = 0, longitude_term=0, name: str | None = None)
Calculates a magnitude estimate based on Barbour et al 2021.
- Parameters:
hypocentral_distance (float) – distance from station to event hypocenter, in km
site_term (float, optional) – site term from Barbour et al 2021, defaults to 0
longitude_term (int, optional) – longitude term from Barbour et al 2021, defaults to 0
name (str, optional) – name for new Timeseries, defaults to None
- Returns:
Timeseries containing a dynamic strain based magnitude estimate as a function of time
- Return type:
Timeseries
- calculate_offsets(limit_multiplier: int = 10, cutoff_percentile: float = 0.75, name: str | None = None)
Calculate offsets using first differencing method (add more details).
- Parameters:
limit_multiplier (int, optional) – _description_, defaults to 10
cutoff_percentile (float, optional) – _description_, defaults to 0.75
name (str, optional) – name for new Timeseries, defaults to None
- Returns:
_description_
- Return type:
_type_
- calculate_pressure_correction(response_coefficients: dict, name: str | None = None)
Generate a pressure correction timeseries from pressure data and response coefficients
- Parameters:
response_coefficients (dict) – response coefficients for each channel loaded from metadata
name (str, optional) – name for new Timeseries, defaults to None
- Returns:
pressure corrections for each channel
- Return type:
Timeseries
- calculate_tide_correction(tidal_parameters: dict, longitude: float, name: str | None = None)
Generate tidal correction timeseries using SPOTL hartid
- Parameters:
tidal_parameters (dict) – tidal parameters loaded from station metadata
longitude (float) – station longitude
name (str, optional) – name for new Timeseries, defaults to None
- Returns:
tidal correction Timeseries calculated for each column/channel in input data
- Return type:
Timeseries
- check_for_gaps(show_stats=True)
generates some statistics around nans, fill values, and missing epochs.
- decimate_1s_to_300s(method: str = 'linear', limit: int = 3600, name: str | None = None)
Decimate 1 Hz data to 5 minute data following
Agnew, Duncan Carr, and K. Hodgkinson (2007), Designing compact causal digital filters for low-frequency strainmeter data, Bulletin of the Seismological Society of America, 97, No. 1B, 91-99
- Parameters:
method (str, optional) – method to interpolate across gaps, defaults to “linear”
limit (int, optional) – largest gap to interpolate, defaults to 3600 samples
name (str, optional) – name for new Timeseries, defaults to None
- Returns:
Timeseries containing 300s decimated data
- Return type:
Timeseries
- decimate_to_hourly(name: str | None = None)
Decimates a timeseries to hourly by selecting the sample at the first minute and first second of each hour
- Parameters:
df (pd.DataFrame) – time series data to decimate
name (str, optional) – name for new Timeseries, defaults to None
- Returns:
Timeseries containing hourly decimated data
- Return type:
Timeseries
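Selecting one sample per hour from a datetime-indexed DataFrame can be sketched with plain pandas as below. The column name `CH0` and the synthetic 5 minute data are made up for the example, and the library's own selection logic may differ in detail.

```python
import numpy as np
import pandas as pd

# Synthetic 5 minute samples spanning three hours (37 samples, both ends included).
index = pd.date_range("2023-01-01", periods=3 * 12 + 1, freq="5min")
df = pd.DataFrame({"CH0": np.arange(len(index), dtype=float)}, index=index)

# Keep only samples that fall exactly on the hour (minute 0, second 0).
on_the_hour = (df.index.minute == 0) & (df.index.second == 0)
hourly = df[on_the_hour]
```

This keeps existing samples and does no filtering or averaging, so any signal with a period shorter than two hours will alias into the hourly series.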
- double_exponential_trend_correction(detrend_params, name: str | None = None)
- dynamic_strain(gauge_weights: list = [1, 1, 1, 1], series='dynamic', name=None)
calculates dynamic strain for a given Timeseries as RMS of gauge strains
- Parameters:
gauge_weights (list, optional) – list of which channels to use, defaults to [1, 1, 1, 1]
series (str, optional) – series name, defaults to “dynamic”
name (str, optional) – name for new Timeseries, defaults to None
- Returns:
calculated dynamic strain as a Timeseries object
- Return type:
Timeseries
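One plausible reading of "RMS of gauge strains" is the root of the weighted mean of the squared gauge values at each sample. The sketch below implements that reading; the exact formula and the treatment of `gauge_weights` in the library are not stated here, so the helper `rms_dynamic_strain` and its normalisation are assumptions.

```python
import math

def rms_dynamic_strain(sample, gauge_weights=(1, 1, 1, 1)):
    """Weighted RMS of the gauge strains in one sample (a sequence of 4 values)."""
    used = sum(gauge_weights)
    if used == 0:
        raise ValueError("at least one gauge must have a nonzero weight")
    weighted_squares = sum(w * x * x for w, x in zip(gauge_weights, sample))
    return math.sqrt(weighted_squares / used)

# Two made-up samples of four gauge strains each.
samples = [(3.0, 4.0, 0.0, 0.0), (1.0, 1.0, 1.0, 1.0)]

dynamic_all = [rms_dynamic_strain(s) for s in samples]
# A weight of 0 drops a gauge from the calculation.
dynamic_first_two = [rms_dynamic_strain(s, gauge_weights=(1, 1, 0, 0)) for s in samples]
```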
- get_eig(name: str | None = None)
Tool to extract eigenvalues and azimuths (from north) from a timeseries with areal (Eee+Enn), differential (Eee-Enn), and engineering shear strain (2Een).
- Parameters:
df (pd.DataFrame) – dataframe with areal (Eee+Enn), differential (Eee-Enn), and engineering shear strain (2Een) columns
name (str, optional) – name for new Timeseries, defaults to None
- Returns:
Timeseries containing amplitudes and azimuths (degrees from north) for the two eigenvectors
- Return type:
Timeseries
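The underlying algebra for one sample can be written out directly from the 2x2 strain tensor [[Eee, Een], [Een, Enn]]. The sketch below is a standalone calculation, not the library's code: the helper name `principal_strains`, the choice to report the azimuth of the larger principal strain, and the 0 to 180 degree range are assumptions, and the library's sign and range conventions may differ.

```python
import math

def principal_strains(areal, differential, shear):
    """Return (e_max, e_min, azimuth_deg) for one sample.

    areal = Eee + Enn, differential = Eee - Enn, shear = 2 * Een (engineering shear).
    """
    mean = areal / 2.0
    radius = 0.5 * math.hypot(differential, shear)
    # Angle of the maximum principal axis, counterclockwise from east.
    theta_from_east = 0.5 * math.degrees(math.atan2(shear, differential))
    # Convert to an azimuth measured clockwise from north, folded into [0, 180).
    azimuth_from_north = (90.0 - theta_from_east) % 180.0
    return mean + radius, mean - radius, azimuth_from_north

# Pure east-west extension: Eee = 1, Enn = 0, Een = 0.
east = principal_strains(areal=1.0, differential=1.0, shear=0.0)
# Pure shear: Eee = Enn = 0, Een = 0.5.
pure_shear = principal_strains(areal=0.0, differential=0.0, shear=1.0)
```

For east-west extension the larger principal strain points east (azimuth 90 degrees); for pure shear the principal axes lie at 45 degrees to the east and north axes.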
- interpolate(replace: int = 999999, method: str = 'linear', limit_seconds: int = 3600, limit_direction='both', name: str | None = None, new_index: DatetimeIndex | None = None, period=None, level=None, series=None)
Interpolate across gaps in data using pd.DataFrame.interpolate()
- Parameters:
replace (int, optional) – gap fill value to interpolate across, defaults to 999999
method (str, optional) – interpolation method, defaults to “linear”
limit_seconds (int, optional) – max gap (in seconds) to interpolate, defaults to 3600
limit_direction (str, optional) – [‘forward’, ‘backward’, ‘both’], defaults to “both”
name (str, optional) – name for new Timeseries, defaults to None
new_index (pd.DatetimeIndex, optional) – option to manually set the index of the interpolated data, defaults to None
period (float, optional) – sample period of data in seconds, defaults to None
level (str, optional) – level of data, defaults to None
series (str, optional) – series name, defaults to None
- Returns:
Timeseries containing interpolated data
- Return type:
Timeseries
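The gap-filling step can be sketched with pd.DataFrame.interpolate() on a plain DataFrame. This is an illustration under assumptions, not the library's implementation: the helper `interpolate_sketch`, the conversion of `limit_seconds` to a sample count by dividing by the sample period, and the column name `CH0` are all made up for the example.

```python
import numpy as np
import pandas as pd

def interpolate_sketch(df, period_s, replace=999999, method="linear",
                       limit_seconds=3600, limit_direction="both"):
    """Replace gap fill values with NaN, then interpolate across short gaps."""
    # Convert the time limit into a maximum number of consecutive samples to fill.
    limit = max(int(limit_seconds / period_s), 1)
    with_nans = df.replace(replace, np.nan)
    return with_nans.interpolate(method=method, limit=limit,
                                 limit_direction=limit_direction)

# Synthetic 300 s data with a two-sample gap marked by the fill value.
index = pd.date_range("2023-01-01", periods=5, freq="300s")
df = pd.DataFrame({"CH0": [1.0, 999999.0, 999999.0, 4.0, 5.0]}, index=index)

filled = interpolate_sketch(df, period_s=300)
```

Note that pandas fills up to `limit` samples into a gap even when the gap is longer than the limit, so long gaps end up partially filled rather than left untouched in this sketch.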
- linear_trend_correction(method='linear', trend_start=None, trend_end=None, name: str | None = None)
Generate a linear trend correction
- Parameters:
method (str, optional) – ‘linear’ or ‘median’, defaults to ‘linear’
trend_start (datetime.datetime, optional) – start of window to calculate trend, defaults to first_valid_index()
trend_end (datetime.datetime, optional) – end of window to calculate trend, defaults to last_valid_index()
name (str, optional) – name for new Timeseries, defaults to None
- Returns:
trend correction timeseries for each column/channel in input data
- Return type:
Timeseries
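A least-squares straight line is one common way to build a linear trend, and the sketch below shows that approach with numpy. How the library fits its ‘linear’ and ‘median’ trends is not described here, so the helper `linear_trend` and the use of `np.polyfit` are assumptions of this example.

```python
import numpy as np

def linear_trend(t_seconds, values):
    """Least-squares straight line through the finite samples, evaluated at every time."""
    ok = np.isfinite(values)  # ignore NaN gaps when fitting
    slope, intercept = np.polyfit(t_seconds[ok], values[ok], deg=1)
    return slope * t_seconds + intercept

# Synthetic data: an exact line with one missing sample.
t = np.arange(0.0, 10.0)
values = 0.5 * t + 2.0
values[3] = np.nan

trend = linear_trend(t, values)
detrended = values - trend  # the gap stays NaN, other samples are near zero
```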
- linearize(reference_strains: dict, gap: float, name: str | None = None)
Processing step to convert digital counts to microstrain based on geometry of GTSM gauges
- Parameters:
reference_strains (dict) – dict containing keys of CHX and values of reference strains
gap (float) – instrument gap in meters
name (str, optional) – name for new Timeseries, defaults to None
- Returns:
Timeseries of linearized data in microstrain
- Return type:
Timeseries
- plot(title: str | None = None, remove_9s: bool = False, zero: bool = False, detrend: str | None = None, ymin: float | None = None, ymax: float | None = None, type: str = 'line', show_quality_flags: bool = False, atmp=None, rainfall=None, save_as: str | None = None)
Generic plotting function for Timeseries data
- Parameters:
title (str, optional) – plot title, defaults to None
remove_9s (bool, optional) – option to remove gap fill values, defaults to False
zero (bool, optional) – option to zero against first_valid_index(), defaults to False
detrend (str, optional) – signal.detrend type, only ‘linear’ implemented currently, defaults to None
ymin (float, optional) – y-axis minimum for plot, defaults to None
ymax (float, optional) – y-axis maximum for plot, defaults to None
type (str, optional) – matplotlib plot type. option of [‘scatter’,’line’], defaults to “line”
show_quality_flags (bool, optional) – option to highlight missing data flags, defaults to False
atmp (Timeseries, optional) – optional Timeseries containing atmospheric pressure data to be plotted in an extra subplot, defaults to None
rainfall (Timeseries, optional) – optional Timeseries containing rainfall data to be plotted in an extra subplot. will also plot cumsum of rainfall during time window. defaults to None
save_as (str, optional) – filename to save as, defaults to None
- remove_fill_values(fill_value, interpolate: bool = False, method: str = 'linear', limit_direction: str = 'both', limit: any | None = None, show_stats: bool = True)
remove gap fill values from data, options to either replace with nans or interpolate
- Parameters:
interpolate (bool, optional) – boolean of whether to interpolate across gaps using pd.DataFrame.interpolate(), defaults to False
method (str, optional) – interpolation method from pd.DataFrame.interpolate(), defaults to “linear”
limit_direction (str, optional) – limit direction from pd.DataFrame.interpolate(), defaults to “both”
limit (any, optional) – limit from pd.DataFrame.interpolate(), defaults to None
show_stats (Bool, optional) – show gap analysis, defaults to True
- Returns:
Timeseries with fill_value gap fills removed, and appropriate flags set
- Return type:
Timeseries
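The idea of replacing fill values with NaN while updating a same-shaped flag table can be sketched with two plain DataFrames. The fill value 999999, the column name `CH0`, and the choice of the “m” (missing) flag for removed samples are assumptions made for this example, based on the flag meanings listed for quality_df.

```python
import pandas as pd

# Synthetic 1 Hz data with one sample set to a gap fill value.
index = pd.date_range("2023-01-01", periods=4, freq="1s")
data = pd.DataFrame({"CH0": [10.0, 999999.0, 12.0, 13.0]}, index=index)

# Flag table with the same shape as the data, initially all "g" (good).
quality = pd.DataFrame("g", index=data.index, columns=data.columns)

fill_value = 999999
is_fill = data == fill_value

cleaned = data.mask(is_fill)          # fill values become NaN
quality = quality.mask(is_fill, "m")  # flag those samples as missing
```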
- save_csv(filename: str, datadir: str = './', sep=',', compression=None)
save data attribute as csv. flattens object, does not save quality flags, level, or version information
- Parameters:
filename (str) – name of csv file to save
datadir (str, optional) – path to local directory to save file, defaults to “./”
sep (str, optional) – separator to use in csv, defaults to “,”
compression (str, optional) – compression algorithm [‘infer’, ‘gzip’, ‘bz2’, ‘zip’, ‘xz’, ‘zstd’], defaults to None
- set_data(df)
- set_local_tdb_uri(local_tdb_uri)
- set_s3_tdb_uri(s3_tdb_uri)
- set_units(units)
- show_flagged_data()
returns dataframe containing any data with a flag other than ‘g’
- Returns:
data that has been flagged
- Return type:
pandas.DataFrame
- show_flags()
returns a dataframe with all flags that are not ‘g’
- Returns:
times and channels with flagged data within the timeseries
- Return type:
pandas.DataFrame
- stats()
displays summary information describing the Timeseries object
- strain_video(start: str | None = None, end: str | None = None, skip: int = 1, interval: float | None = None, azimuth_arrow: float | None = None, title: str | None = None, units: str | None = None, repeat: bool = False, savegif: str | None = None)
Displays a gif of the strain time series provided, with time series and strain axes displayed. Strain is shown relative to the first data point.
- Parameters:
start (str, optional) – start of the video as a datetime string
end (str, optional) – end of the video as a datetime string
skip (int, optional) – number of data points to skip per frame (e.g. if using a 5 minute Timeseries, skip=2 will decimate the dataset to a 10 minute period)
interval (float, optional) – time between frames (in microseconds)
azimuth_arrow (float, optional) – directional arrow to plot behind the strain axes, in degrees, defaults to None
title (str, optional) – plot title
repeat (bool, optional) – choose if the animation repeats, defaults to False
units (str, optional) – units to label strain
- Returns:
Gif of the strain time series
- Return type:
matplotlib.animation
Example
>>> # Import relevant modules from the earthscopestraintools package
>>> from earthscopestraintools.mseed_tools import ts_from_mseed
>>> from earthscopestraintools.gtsm_metadata import GtsmMetadata
>>> # Metadata
>>> network = 'PB'
>>> station = 'B004'
>>> meta = GtsmMetadata(network,station)
>>> # Provide the start and end times
>>> start = '2019-07-01'
>>> end = '2019-07-07'
>>>
>>> # load data
>>> strain_raw = ts_from_mseed(network=network, station=station, location='T0', channel='RS*', start=start, end=end)
>>> strain_linearized = strain_raw.linearize(reference_strains=meta.reference_strains,gap=meta.gap)
>>> strain_reg = strain_linearized.apply_calibration_matrix(calibration_matrix=meta.strain_matrices['ER2010'])
>>> # make video, save .gif
>>> %matplotlib widget
>>> anim = strain_reg.strain_video(interval=1, title=f'{station}, One Week',units='ms',savegif=f'{station}.{start}.{end}.gif')
- truncate(new_start=None, new_end=None, in_place=False, show_stats=True)
Uses pandas.DataFrame.truncate() to trim the start and/or end of a Timeseries object
- Parameters:
new_start (date, str, int, optional) – new beginning of Timeseries, defaults to None
new_end (date, str, int, optional) – new end of Timeseries, defaults to None
- Returns:
truncated Timeseries object
- Return type:
Timeseries
- earthscopestraintools.timeseries.plot_timeseries_comparison(timeseries: list = [], title: str | None = None, names: list = [], remove_9s: bool = False, zero: bool = False, detrend: str | None = None, type: str = 'line', save_as: str | None = None)
Plot multiple Timeseries in the same plot to compare values. Useful for viewing uncorrected vs corrected data.
- Parameters:
timeseries (list, optional) – list of Timeseries to plot, defaults to []
title (str, optional) – plot title, defaults to None
names (list, optional) – list of names to use in legend, defaults to []
remove_9s (bool, optional) – option to remove gap fill values, defaults to False
zero (bool, optional) – option to zero against first_valid_index(), defaults to False
detrend (str, optional) – signal.detrend type, only ‘linear’ implemented currently, defaults to None
type (str, optional) – matplotlib plot type. option of [‘scatter’,’line’], defaults to “line”
save_as (str, optional) – filename to save as, defaults to None
- earthscopestraintools.timeseries.test()