Seismic Ambient Noise (seislib.an)

As shown in several theoretical studies, the cross correlation of seismic ambient noise can be related to the surface-wave Green’s function between the two points of observation [1]. In case of a diffuse ambient wave-field recorded at two receivers on the vertical component, the empirical Rayleigh-wave Green’s function has a relatively simple expression, that can be employed to constrain the velocity structure of the Earth’s interior. In the frequency domain, this is proportional to a zeroth order Bessel function of the first kind (\(J_0\)), and reads [2]

\[\Re{\big\lbrace\rho({\bf x}_A, {\bf x}_B, \omega)\big\rbrace} \approx J_0\left(\frac{\omega \Delta}{c}\right) \mbox{e}^{-\alpha \Delta},\]

where \(\Delta\), \(\omega\), and c denote inter-station distance, angular frequency, and phase velocity, respectively, \(\rho\) the statistical expectation of the normalized cross-spectrum associated with the two ambient-noise recordings, and \(\Re{\big\lbrace \dotso \big\rbrace}\) maps a complex number into its real part. The exponential damping term in the above equation accounts for the (possibly frequency-dependent) attenuation of the Rayleigh waves propagating between the two receivers \({\bf x}_A\) and \({\bf x}_B\), through the coefficient \(\alpha\) [2].

In ambient-noise seismology, where continuous seismograms of relatively long duration (months or years) are employed, the statistical expectation of \(\rho\) is replaced by an ensemble average of the cross-spectra calculated over a relatively large number of time windows. This contributes to approximating the condition of a diffuse ambient wave-field [1], allowing the use of the above equation to measure the (average) inter-station phase velocity.

In practice, SeisLib makes use of the above equation to calculate

  • Rayleigh and Love phase velocities:

    since phase velocity is related to the phase of the empirical Green’s function, but not to its amplitude, the exponential damping term is neglected, simplifying the problem of retrieving c from the data. This approach resulted in numerous successful applications of velocity imaging and monitoring, and can nowadays be considered standard in ambient-noise tomography [3]. (See seislib.an.an_velocity)

  • Rayleigh-wave attenuation:

    where the attenuation coefficient is retrieved by nonlinear inversion based on preliminary measurements of phase velocity [2]. (See seislib.an.an_attenuation)

References

1(1,2)

Boschi & Weemstra, (2015), Reviews of Geophysics Stationary-phase integrals in the cross correlation, Reviews of Geophysics

2(1,2,3)

Magrini & Boschi, (2021), Surface‐Wave Attenuation From Seismic Ambient Noise: Numerical Validation and Application, JGR

3

Nakata et al., (2019), Seismic ambient noise, Cambridge University Press

Download of Continuous Seismograms

The below class allows for automated downloads of continuous seismograms to be used in seismic-ambient-noise tomography. The data will be saved on disk in the format required by seislib.an.AmbientNoiseVelocity and seislib.an.AmbientNoiseAttenuation to calculate surface-wave dispersion curves and Rayleigh-wave attenuation, respectively.

class seislib.an.an_downloader.ANDownloader(savedir, inventory_name, provider='iris', user=None, password=None, duration=43200, sampling_rate=1, units='disp', prefilter=(0.005, 0.01, 0.25, 0.5), attach_response=False, stations_config={}, sleep_time_after_exception=10, verbose=True, reversed_iterations=False, max_final_stream_duration=126144000)

Bases: object

Downloads seismic data to be used in ambient noise interferometry

Parameters
savedirstr

Absolute path to the directory where the data will be saved. Inside this, a sub-directory named ‘data’ will be created, inside which the continuous data will be stored in the format net.sta.loc.cha.sac. (net=network code, sta=station, loc=location, cha=channel)

inventory_namestr

Name of the inventory (obspy.core.inventory.inventory.Inventory) associated with the downloads. This will be saved in xml format in the savedir

providerstr

Provider of seismic data, passed to obspy.clients.fdsn.Client. Default is iris

user, passwordstr, optional

User name and password for accessing restricted data. Passed to obspy.clients.fdsn.Client

durationfloat, optional

Duration (in seconds) of the individual seismic waveforms to be downloaded. They will be merged together once all individual waveforms from a given station are downloaded. Default is 43200s, i.e., 12h.

sampling_ratefloat

Final sampling rate of the waveforms, in Hz. Default is 1 Hz

prefiltertuple of floats

Bandapass filter applied to the waveforms before removing the intrument response. Default is (0.005, 0.01, 0.25, 0.5).

units{‘DISP’, ‘VEL’, ‘ACC’}

Physical units of the final waveforms. Can be either ‘DISP’, ‘VEL’, or ‘ACC’ (case insensitive). Default is ‘DISP’.

attach_responsebool

If True, the details of the instrument response are attached to the header of the waveforms during their downloads. It can slow down the downloads, but it will make errors due to response information less likely. Default is False

stations_configdict

python dictionary containing optional keys and values passed to obspy.clients.fdsn.Client.get_stations.

sleep_time_after_exceptionfloat

Time to wait (in seconds) after an obspy.clients.fdsn.header.FDSNException. The exception could be due to temporal issues of the client, so waiting a little bit before the next download could be useful to get things back to normality.

verbosebool

If True, information on the downloads will be printed in console

reversed_iterationsbool

If True, the station inventory will be iterated over in a reversed order.

max_final_stream_durationfloat

Maximum lenght (in seconds) of the continuous waveforms. Default is 126144000s, i.e. 4 years. By default, when more than 4 years of data are available, the first 4 years are stored in the ‘data’ directory with the conventional name net.sta.loc.cha.sac; the following batches are saved as net.sta.loc.cha_1.sac (for the 4th-8th years of data), net.sta.loc.cha_2.sac (for the 8th-12th years), net.sta.loc.cha_3.sac (for the 12th-16th years), etc.

Examples

The following will download all the continuous waveforms available from the 1.1.2020 to the 1.1.2021 for the network II and for the channel BH (3 components: BHZ, BHE, BHN), within the region specified by minlatitude, maxlatitude, minlongitude, maxlongitude. Since we do not have access to restricted data, in this case, includerestricted is set to False. See obspy documentation on Client.get_stations for more info on the possible arguments passed to stations_config:

from obspy import UTCDateTime as UTC

stations_config = dict(network='II',
                       channel='BH*',
                       starttime=UTC(2020, 1, 1),
                       endtime=UTC(2021, 1, 1),
                       includerestricted=False,
                       maxlatitude=12,
                       minlatitude=-18,
                       minlongitude=90,
                       maxlongitude=140)

Note

If channel is not specified in stations_config, the default will be ‘HH*’ (i.e., HHZ, HHE, HHN). Multiple channels can be passed using a comma as, e.g., ‘HH*,BH*’, or ‘HHZ,BHZ’. In the downloads, priority is given to the first specified channel (in the above, the BH* is downloaded only if HH* is not available).

We initialize the ANDownloader instance, and then start it:

downloader = ANDownloader(savedir=/path/to/directory, 
                          inventory_name='II.xml',
                          provider='iris',
                          sampling_rate=1,
                          prefilter=(0.005, 0.01, 0.5, 1),
                          units='disp',
                          attach_response=False,
                          stations_config=stations_config,
                          verbose=True)
downloader.start()

Note

You can kill the above process without any fear: the next time you will run the code, it will pick up where it left off.

Hint

If instead you want to modify your station_configs, make sure to either delete the .xml file created at the previous run of the algorithm or use a different inventory_name!

Attributes
savedirstr

Absolute path of the directory that will be created

staclientobspy.clients.fdsn.Client

Provider of station metadata and waveforms

channelslist

List of candidate channels for the downloads

componentsint

Either 3 (ZNE) or 1 (Z)

durationint, float

Duration of the waveforms downloaded (in s)

max_final_stream_durationfloat

Maximum duration of the final continuous seismograms

sampling_ratefloat

Sampling rate of the final seismograms

prefiltertuple of floats

Filter applied to the waveform before removal of the instrument response

unitsstr

Physical units of the final seismogram

attach_responsebool

Whether or not the response information is attached to the waveforms during the download

verbosebool

Whether or not progress information are displayed in the console

reversed_iterationsbool

Whether or not the inventory of stations is reversely iterated over

sleep_time_after_exceptionfloat

The downloads are stopped for the specified time (in s) after a FDSN exception

stations_configdict

Dictionary object with information relevant to the download of the station metadata

inventoryobspy.core.inventory.inventory.Inventory

Stations metadata

_exceptionscollections.defaultdict(int)

Dictionary object with information on the occurred exceptions

_downloadedint

Number of waveforms downloaded

_no_stationsint

Number of stations to download

_stations_doneint

Number of stations for which the downloads have been finished

Methods

active_channels(station)

Channels available for the given station among those to download

adjust_channels(st)

If the stream contains the Z12 channels, these are rotated towards ZNE.

build_inventory(**kwargs)

Builds an obspy inventory containing stations information

collect_waveforms(network, station, ...)

Downloads obspy stream

compile_header(stream, stla, stlo, stel)

Returns the obspy stream with the header compiled in sac format.

endtime_for_download(station)

Gives the end time of the last time window to be downloaded for a given station

handle_multiple_locations(st, station_info)

Automatic selection of location.

inventory_iterator(inventory[, reverse])

Generator function to iterate over an obspy inventory

last_time_downloaded(tmp_folder)

Retrieves the last time downloaded at the previous run of the algorithm

merge_days_and_save(station_code)

Merges all temporary files and saves them as a unique file in the 'data' directory.

operating_times_formatter()

Utility function to print an obspy.UTCDateTime in a nice way

plot_stations([ax, show, oceans_color, ...])

Maps the seismic receivers available for download.

prepare_data(st)

Demean, detrend, tapering, removing response and resampling

preprocessing(st, station)

Preprocessing of the obspy stream

save_tmp_files(stream, outdir, starttime)

Saves temporary files (of duration equal to self.duration) that will be merged and saved in the 'data' directory

select_components(st)

Handles the absence of some components in the final stream

start()

Starts the downloads

starttime_for_download(tmp_folder, station)

Identifies the start time from which the downloads will begin for a given station

station_coordinates(station_info)

Fetch station coordinates (lat, lon, elevation).

station_was_active(station, time)

Wheater or not the seismic station was active at the given time

update_done_stations(station_code)

Updates the 'done_stations.txt' file in the /tmp directory

update_stats([stations, downloaded])

Updates information on number of downloaded waveforms and done stations

active_channels(station)

Channels available for the given station among those to download

Parameters
stationobspy.core.inventory.station.Station
Returns
List of channels
adjust_channels(st)

If the stream contains the Z12 channels, these are rotated towards ZNE.

Parameters
stobspy.core.stream.Stream
Returns
obspy.core.stream.Stream if the rotation is successful, else None
build_inventory(**kwargs)

Builds an obspy inventory containing stations information

The strategy is to first download inventory information at station level. Then, for each station, the instrument response is downloaded for each channel. This may increase the downloading time, but prevents possible “timed out errors”.

Parameters
kwargs :

Additional key word arguments passed to the get_stations method of obspy.clients.fdsn.client

Returns
invobspy.core.inventory.inventory.Inventory
collect_waveforms(network, station, channels, starttime, endtime)

Downloads obspy stream

Parameters
network, station, locationstr

network, station and location codes

channelsiterable

iterable containing the channel codes suited to the download. Each channel will be tried to be used in the downloads. The first successfull attempt determines the returned waveforms. (Higher priority is given to decreasing indexes)

starttime, endtimeobspy.core.utcdatetime.UTCDateTime

start and end time of the stream.

Returns
obspy.core.stream.Stream if the download is successful, else None
compile_header(stream, stla, stlo, stel)

Returns the obspy stream with the header compiled in sac format.

Parameters
streamobspy.core.stream.Stream
stla, stlo, stelfloat

latitude, longitude, elevation of the seismic station

Returns
obspy.core.stream.Stream
endtime_for_download(station)

Gives the end time of the last time window to be downloaded for a given station

Parameters
stationobspy.core.inventory.station.Station
Returns
obspy.core.utcdatetime.UTCDateTime
handle_multiple_locations(st, station_info)

Automatic selection of location.

Parameters
stobspy.core.stream.Stream
station_infoobspy.core.inventory.station.Station
Returns
Obspy stream
classmethod inventory_iterator(inventory, reverse=False)

Generator function to iterate over an obspy inventory

Parameters
inventoryobspy.core.inventory.inventory.Inventory
reversebool

If True, the inventory will be iterated over from bottom to top

Yields
(2,) tuple containing network and station information at each iteration
last_time_downloaded(tmp_folder)

Retrieves the last time downloaded at the previous run of the algorithm

Parameters
tmp_folderstr

Absolute path to the temporary directory where the individual waveforms associated with a given station are saved

Returns
obspy.core.utcdatetime.UTCDateTime.timestamp if files are present in the

temporary directory, else None

merge_days_and_save(station_code)

Merges all temporary files and saves them as a unique file in the ‘data’ directory. If these files exceed self.max_final_stream_duration, they are splitted and saved in several batches.

Parameters
station_codestr

Station code in the format net.sta

classmethod operating_times_formatter()

Utility function to print an obspy.UTCDateTime in a nice way

Parameters
stationobspy.core.inventory.station.Station
Returns
start, endstr

Starting and ending operating time of the seismic station

plot_stations(ax=None, show=True, oceans_color='water', lands_color='land', edgecolor='k', projection='Mercator', resolution='110m', color_by_network=True, legend_dict={}, **kwargs)

Maps the seismic receivers available for download.

Parameters
axcartopy.mpl.geoaxes.GeoAxesSubplot

If not None, the receivers are plotted on the GeoAxesSubplot instance. Otherwise, a new figure and GeoAxesSubplot instance is created

showbool

If True, the plot is shown. Otherwise, a GeoAxesSubplot instance is returned. Default is True

oceans_color, lands_colorstr

Color of oceans and lands. The arguments are ignored if ax is not None. Otherwise, they are passed to cartopy.feature.NaturalEarthFeature (to the argument ‘facecolor’). Defaults are ‘water’ and ‘land’

edgecolorstr

Color of the boundaries between, e.g., lakes and land. The argument is ignored if ax is not None. Otherwise, it is passed to cartopy.feature.NaturalEarthFeature (to the argument ‘edgecolor’). Default is ‘k’ (black)

projectionstr

Name of the geographic projection used to create the GeoAxesSubplot. (Visit the cartopy website for a list of valid projection names.) If ax is not None, projection is ignored. Default is ‘Mercator’

resolution{‘10m’, ‘50m’, ‘110m’}

Resolution of the Earth features displayed in the figure. Passed to cartopy.feature.NaturalEarthFeature. Valid arguments are ‘110m’, ‘50m’, ‘10m’. Default is ‘110m’. Ignored if ax is not None.

color_by_networkbool

If True, each seismic network will have a different color in the resulting map, and a legend will be displayed. Otherwise, all stations will have the same color. Default is True

legend_dictdict, optional

Dictionary of keyword arguments passed to matplotlib.pyplot.legend

**kwargsdict, optional

Additional keyword arguments passed to matplotlib.pyplot.scatter

Returns
If show is True, None, else ax, i.e. the GeoAxesSubplot
prepare_data(st)

Demean, detrend, tapering, removing response and resampling

Parameters
stobspy.core.stream.Stream
Returns
obspy.core.stream.Stream if the processing is successful, else None
preprocessing(st, station)

Preprocessing of the obspy stream

The function calls sequentially the methods handle_multiple_locations(), prepare_data(), and adjust_channels().

Parameters
stobspy.core.stream.Stream
stationobspy.core.inventory.station.Station
Returns
obspy.core.stream.Stream if the preprocessing is successful, else None
save_tmp_files(stream, outdir, starttime)

Saves temporary files (of duration equal to self.duration) that will be merged and saved in the ‘data’ directory

Parameters
streamobspy.core.stream.Stream
outdirstr

Absolute path to saving directory

starttimestr, float (timestamp format)

Starttime of the stream, only used to name the temporary file

select_components(st)

Handles the absence of some components in the final stream

Parameters
stobspy.Stream
Returns
stobspy.Stream

If the expected three-component stream lacks of one or both the horizontal components, the vertical component is returned. If the stream lacks the vertical component but has the two horizontal ones, it returns the horizontal components. If only one horizontal component is available, it returns None.

start()

Starts the downloads

The inventory of stations will be iterated over and, for each station code, all the seismic waveforms available for download will be stored (after after detrending, demeaning, removing the instrument response, and resampling) in a temporary directory ($savedir/tmp). Once, for the same station, the downloads are completed, the individual waveforms are merged to a unique continuous seismogram (if that does not exceed the maximum duration of the stream indicated by the user, otherwise it will be splitted into different files. (See param max_final_stream_duration in ANDownloader)

The algorithm keeps track of the progress made. This allows one to stop the downloads and start from where it was left off at any time.

starttime_for_download(tmp_folder, station)

Identifies the start time from which the downloads will begin for a given station

Parameters
tmp_folderstr

Absolute path to the temporary folder where the individual waveforms associated with a given station are saved

stationobspy.core.inventory.station.Station
Returns
obspy.core.utcdatetime.UTCDateTime
classmethod station_coordinates(station_info)

Fetch station coordinates (lat, lon, elevation).

Parameters
station_infoobspy.core.inventory.station.Station
Returns
latitude, longitude, elevationfloat
classmethod station_was_active(station, time)

Wheater or not the seismic station was active at the given time

Parameters
stationobspy.core.inventory.station.Station
timeobspy.core.utcdatetime.UTCDateTime
Returns
bool
update_done_stations(station_code)

Updates the ‘done_stations.txt’ file in the /tmp directory

Parameters
station_codestr

Station code in the format net.sta

update_stats(stations=False, downloaded=False)

Updates information on number of downloaded waveforms and done stations

Parameters
stations, downloadedbool

If True, the attributes _stations_done and _downloaded are updated.

Inter-Station Dispersion Curves

The below class provides automated routines to calculate Rayleigh and Love inter-station dispersion curves based on the continuous seismograms recorded at pairs of receivers.

class seislib.an.an_velocity.AmbientNoiseVelocity(src, savedir=None, component='Z', verbose=True)

Bases: object

Class to obtain surface-wave phase velocities out of continuous seismograms, via cross-correlation of seismic ambient noise.

Parameters
srcstr

Absolute path to the directory containing the files associated with the continuous seismograms

savedirstr, optional

Absolute path to the directory where the results are saved. If not provided, its value is set to the parent directory of src (i.e., $src/..). All the results will be saved in the directory $savedir/an_velocity/$component (see the component parameter)

component{‘Z’, ‘R’, ‘T’}

Either ‘Z’, ‘R’, (corresponding to vertically and radially polarized Rayleigh waves, respectively), or ‘T’ (for Love waves)

verbosebool

Whether or not information on progress is printed in the console

Examples

The following will calculate the dispersion curves for a set of continuous seismograms stored in the directory src=’/path/to/data’. The data have been downloaded using :class:seislib.an.an_downloader.ANDownloader. We extract dispersion curves for vertically-polarized Rayleigh waves (on the vertical - Z - component).

>>> from seislib.an import AmbientNoiseVelocity
>>> an = AmbientNoiseVelocity(
...         src=src,
...         component='Z')
>>> print(an)    
RAYLEIGH-WAVE VELOCITY FROM SEISMIC AMBIENT NOISE
==================================================
RECEIVERS: X
STATION PAIRS AVAILABLE: Y
DISPERSION CURVES RETRIEVED: 0
==================================================
SOURCE DIR: /path/to/data
SAVE DIR: /path/to/an_velocity/Z

In the above printed information X and Y are integers and depend on your data.

Before extracting the dispersion curves, we need to run:

an.prepare_data()

which will extract information on each seismic receiver from the header of the sac files. These include (i) station coordinates and (ii) the time window spanned by the associated seismogram. The information is saved into two separate files, i.e., /path/to/an_velocity/Z/stations.pickle and /path/to/an_velocity/Z/timespans.pickle. These include (i) stations coordinates and (ii) the time window spanned by the associated seismograms.

Note

If the files containing the continuous seismograms do not have a sac header, or if they have a different extension than .sac, the user can create the two above files on his own, before calling the prepare_data method. (However, each file name should be in the format net.sta.loc.cha’, where net, sta, and loc (optional) are network, station, and location code, respectively, and cha is channel (e.g., BHZ, BHN, or BHE). (If the component is R or T, the NE channels will be loaded and rotated during the calculation of the phase velocities.) For example, IV.AQU.00.BHZ.

The stations.pickle file should contain a dictionary object where each key corresponds to a station code ($net.$sta) and each value should be a tuple containing latitude and longitude of the station. For example:

{ net1.sta1 : (lat1, lon1), net1.sta2 : (lat2, lon2)}

The timespans.pickle file should contain a dictionary object where each key corresponds to a station code ($net.$sta) and each value should be a tuple containing the starttime and endtime (obspy.UTCDateTime.timestamp) of the continuous seismogram associated with the station. For example:

{net1.sta1 : (starttime1, endtime1), net1.sta2 : (starttime2, endtime2)}

After the data have been “prepared”, the receivers available can be displayed by typing:

an.plot_stations()

Now we can calculate the dispersion curves automatically, provided that we pass a reference curve (i.e., ndarray of shape (n, 2), where the 1st column is frequency, the 2nd is phase velocity). For example:

an.extract_dispcurves(refcurve, 
                      freqmin=0.01, 
                      freqmax=0.5, 
                      cmin=1.5,
                      cmax=4.5, 
                      distmax=2000, 
                      window_length=3600, 
                      overlap=0.5, 
                      min_no_days=30)

The above will calculate the dispersion curves for all combinations of station pairs for which the inter-station distance is < 2000 km and at least 30 days of simultaneous recordings are available. Cross-correlations will be computed in the frequency range 0.01-0.5 Hz on 50%-overlapping time windows. The results (ndarrays of shape (m, 2), where the 1st column is frequency and the 2nd is phase velocity) will be saved to /path/to/an_velocity/Z/dispcurves.

Attributes
srcstr

Absolute path to the directory containing the files associated with the continuous seismograms

savedirstr

Absolute path to the directory where the results are saved

component{‘Z’, ‘R’, ‘T’}

Either ‘Z’, ‘R’, or ‘T’

fileslist

List of files to be processed to calculate the dispersion curves

verbosebool

Whether or not information on progress is printed in the console

stationsdict

Station coordinates that can be employed to retrieve dispersion curves (can be accessed after calling the method prepare_data())

Methods

convert_to_kms(dispcurve)

Converts from m/s to km/s the dispersion curve (if necessary).

extract_dispcurves(refcurve[, freqmin, ...])

Automatic extraction of the dispersion curves for all available pairs of receivers.

get_files()

Retrieves the files to be processed for extracting the phase velocities

get_times_and_coords(files)

Retrieves the geographic coordinates and the starting and ending time associated with each continuous seismogram

interpolate_dispcurves(frequency)

Interpolates the dispersion curves found at $self.savedir/dispcurves at the specified frequency.

plot_stations([ax, show, oceans_color, ...])

Maps the seismic receivers for which data are available

prepare_data([recompute])

Saves to disk the geographic coordinates and the starting and ending time associated with each continuous seismogram.

prepare_input_tomography(savedir, period[, ...])

Prepares a .txt file for each specified period, to be used for calculating phase-velocity maps using seislib.tomography.tomography.SeismicTomography.

classmethod convert_to_kms(dispcurve)

Converts from m/s to km/s the dispersion curve (if necessary).

Parameters
dispcurvendarray of shape (n, 2)

The 1st column (typically frequency or period) is ignored. The 2nd should be velocity. If the 1st value of velocity divided by 10 is greater than 1, the 2nd column is divided by 1000. Otherwise, the dispersion curve is left unchanged.

Returns
dispcurvendarray of shape (n, 2)
extract_dispcurves(refcurve, freqmin=0.01, freqmax=0.5, cmin=1.5, cmax=4.5, distmax=None, window_length=3600, overlap=0.5, min_no_days=30, reverse_iteration=False, plotting=False, manual_picking=False)

Automatic extraction of the dispersion curves for all available pairs of receivers.

The results are saved to $self.savedir/dispcurves in .npy format, and consist of ndarrays of shape (n, 2), where the 1st column is frequency and the 2nd phase velocity (in m/s).

The routine iterates over all the available combinations of station pairs and, for each one, (i) computes the cross spectrum (in the frequency domain) by ensamble averaging the cross correlations calculated over relatively small (and possibly overlapping, see overlap) time windows (see window_length), (ii) filters the cross-spectrum using a “velocity” filter [1], and (iii) extracts a smooth dispersion curve by comparison of the zero-crossings of the cross-spectrum with those of the Bessel function associated with the station pair in question [2].

Parameters
refcurvendarray of shape (n, 2)

Reference curve used to extract the dispersion curves. The 1st column should be frequency, the 2nd velocity (in either km/s or m/s). The reference curve is automatically converted to km/s, the physical units employed in the subsequent analysis

freqmin, freqmaxfloat

Minimum and maximum frequency analysed by the algorithm (default are 0.01 and 0.5 Hz). The resulting dispersion curves will be limited to this frequency range

cmin, cmaxfloat

Estimated velocity range (in km/s) spanned by the dispersion curves (default values are 1.5 and 4.5)

distmaxfloat

Maximum inter-station distance (in km), beyond which a given station pair is not considered in the calculation of the phase velocities

window_lenghtfloat

Length of the time windows (in s) used to perform the cross correlations

overlapfloat

Should be >=0 and <1 (strictly smaller than 1). Rules the extent of overlap between one time window and the following [3]. Default is 0.5

min_no_daysint, float

Minimum number of simultaneous recordings available for a given station pair to be considered for the extraction of phase velocity. Default is 30

reverse_iterationbool

If True, the list of combinations of station pairs will be iterated over reversely. This can be useful to run two processes in parallel so as to halve the computation times (one function call setting reverse_iteration`=`False, the other, in a different process, setting it to True). Default is False

plottingbool

If True, a figure is created for each retrieved dispersion curve. This is automatically displayed and saved in $self.savedir/figures

manual_pickingbool

If True, the user is required to pick each dispersion curve manually. The picking is carried out through an interactive plot.

Returns
None

The dispersion curves are saved to $self.savedir/dispcurves

References

1

Magrini & Boschi, (2021), Surface‐Wave Attenuation From Seismic Ambient Noise: Numerical Validation and Application, JGR

2

Kästle et al. 2016, Two-receiver measurements of phase velocity: cross- validation of ambient-noise and earthquake-based observations, GJI

3

Seats et al. 2012, Improved ambient noise correlation functions using Welch’s method, GJI

get_files()

Retrieves the files to be processed for extracting the phase velocities

Returns
fileslist of str

e.g. [‘net1.sta1.00.BHZ.sac’, ‘net1.sta2.00.BHZ.sac’]

get_times_and_coords(files)

Retrieves the geographic coordinates and the starting and ending time associated with each continuous seismogram

Parameters
files: list of str

Names of the files corresponding with the continuous seismograms, located in the src directory

Returns
timesdict

each key corresponds to a station code ($network_code.$station_code) and each value is a tuple containing the starttime and endtime (in timestamp format, see obspy documentation on UTCDateTime.timestamp) of the continuous seismogram associated with the station. For example:

{ net1.sta1 : (starttime1, endtime1),
  net1.sta2 : (starttime2, endtime2),
  net2.sta3 : (starttime3, endtime3)
  }
coordsdict

each key corresponds to a station code ($network_code.$station_code) and each value is a tuple containing latitude and longitude of the station. For example:

{ net1.sta1 : (lat1, lon1),
  net1.sta2 : (lat2, lon2),
  net2.sta3 : (lat3, lon3)
  }
interpolate_dispcurves(frequency)

Interpolates the dispersion curves found at $self.savedir/dispcurves at the specified frequency. (No extrapolation is made.)

Parameters
frequencyfloat, iterable of floats

Frequency (or frequencies) at which the dispersion curves will be interpolated

Returns
coordsndarray of shape (n, 4)

Coordinates (lat1, lon1, lat2, lon2) of the station pairs for which a dispersion curve has been calculated

measurementsndarray of shape (n, f)

Phase velocities calculated for each station pair in coords at the specified frequency. f is the number of frequencies

Note

measurements could contain nans

Examples

Assume you calculated dispersion curves from your data, which are located in /path/to/data, and you had initialized your class:AmbientNoiseVelocity instance, to calculate inter-station dispersion curves, as:

an = AmbientNoiseVelocity(src=SRC,
                          component='Z')
an.prepare_data()

Even if the computation of the dispersion curves is not yet finished, you can use the above instance to interpolate all the available dispersion curves as follows:

frequency = [1/30, 1/20, 1/10, 1/5]
coords, measurements = an.interpolate_dispcurves(frequency)        
plot_stations(ax=None, show=True, oceans_color='water', lands_color='land', edgecolor='k', projection='Mercator', resolution='110m', color_by_network=True, legend_dict={}, **kwargs)

Maps the seismic receivers for which data are available

Parameters
axcartopy.mpl.geoaxes.GeoAxesSubplot

If not None, the receivers are plotted on the GeoAxesSubplot instance. Otherwise, a new figure and GeoAxesSubplot instance is created

showbool

If True, the plot is shown. Otherwise, a GeoAxesSubplot instance is returned. Default is True

oceans_color, lands_colorstr

Color of oceans and lands. The arguments are ignored if ax is not None. Otherwise, they are passed to cartopy.feature.NaturalEarthFeature (to the argument ‘facecolor’). Defaults are ‘water’ and ‘land’

edgecolorstr

Color of the boundaries between, e.g., lakes and land. The argument is ignored if ax is not None. Otherwise, it is passed to cartopy.feature.NaturalEarthFeature (to the argument ‘edgecolor’). Default is ‘k’ (black)

projectionstr

Name of the geographic projection used to create the GeoAxesSubplot. (Visit the cartopy website for a list of valid projection names.) If ax is not None, projection is ignored. Default is ‘Mercator’

resolution{‘10m’, ‘50m’, ‘110m’}

Resolution of the Earth features displayed in the figure. Passed to cartopy.feature.NaturalEarthFeature. Valid arguments are ‘110m’, ‘50m’, ‘10m’. Default is ‘110m’. Ignored if ax is not None.

color_by_networkbool

If True, each seismic network will have a different color in the resulting map, and a legend will be displayed. Otherwise, all stations will have the same color. Default is True

legend_dictdict, optional

Dictionary of keyword arguments passed to matplotlib.pyplot.legend

**kwargsdict, optional

Additional keyword arguments passed to matplotlib.pyplot.scatter

Returns
If show is True, None, else ax, i.e. the GeoAxesSubplot
prepare_data(recompute=False)

Saves to disk the geographic coordinates and the starting and ending time associated with each continuous seismogram. These are saved to $self.savedir/stations.pickle and $self.savedir/timespans.pickle, respectively.

The stations.pickle file contains a dictionary object where each key corresponds to a station code ($network_code.$station_code) and each value is a tuple containing latitude and longitude of the station. For example:

{ net1.sta1 : (lat1, lon1),
  net1.sta2 : (lat2, lon2),
  net2.sta3 : (lat3, lon3)
  }

The timespans.pickle file contains a dictionary object similar to the above, where each value is a tuple containing the starttime and endtime (obspy.UTCDateTime.timestamp) of the continuous seismogram associated with the station. For example:

{ net1.sta1 : (starttime1, endtime1),
  net1.sta2 : (starttime2, endtime2),
  net2.sta3 : (starttime3, endtime3)
  }
Parameters
recomputebool

If True, the station coordinates and times will be removed from disk and recalculated. Otherwise (default), if they are present, they will be loaded into memory, avoiding any computation. This parameter should be set to True whenever one has added new files to the source directory (and has not updated manually stations.pickle and timespans.pickle).

Notes

If component is ‘Z’, only continuous seismograms recorded on the vertical component are considered (e.g., BHZ).

If component is either ‘R’ or ‘T’, however, for a given station to be considered in the calculation of phase velocities both the horizontal components (N and E, e.g. BHN and BHE) should be available. These will be rotated so as to analyse the transverse (T, for Love) and radial (R, for radially-polarized Rayleigh waves) components

prepare_input_tomography(savedir, period, min_no_wavelengths=2, outfile='input_%.2fs.txt')

Prepares a .txt file for each specified period, to be used for calculating phase-velocity maps using seislib.tomography.tomography.SeismicTomography.

Parameters
savedirstr

Absolute path to the directory where the file(s) is (are) saved. If savedir does not exist, it will be created

periodfloat, iterable of floats

Period (or periods) at which the dispersion curves will be interpolated (see interpolate_dispcurves())

min_no_wavelengthsfloat

Ratio between the estimated wavelength \(\lambda = T c_{ref}\) of the surface-wave at a given period T and interstation distance \(\Delta\). Whenever this ratio is > min_no_wavelength, the period in question is not used to retrieve a dispersion measurement. Values < 1 are suggested against. Default is 2

outfilestr

Format for the file names. It must include either %s or %.Xf (where X is integer), since this will be replaced by each period analysed (one for file)

Examples

Assume you calculated dispersion curves from your data, which are located in /path/to/data, and you had initialized your class:AmbientNoiseVelocity instance, to calculate inter-station dispersion curves, as:

an = AmbientNoiseVelocity(src=SRC,
                          component='Z')
an.prepare_data()

You can, even if the computation of the dispersion curves is not finished, use the above instance to extract the information needed to produce phase-velocity maps as follows:

savedir = /arbitrary/path/to/savedir
periods = [5, 6, 8, 10, 12, 15, 20]
an.prepare_input_tomography(savedir=savedir,
                            period=periods)

The above will save one txt file for each specified period.

Rayleigh-Wave Attenuation

The below class provides support for calculating Rayleigh-Wave attenuation based on continuous ambient-noise recordings of a dense seismic array.

As opposed to the problem of calculating phase velocities, which are only related to the phase of the empirical Green’s function, estimates of attenuation should rely on its amplitude. Previous studies showed how the amplitude of the empirical Green’s function should be treated with caution, because of its sensitivity to parameters such as the distribution of noise sources [1]. Accordingly, [2] designed a procedure that should contribute to “regularizing” the subsequent inversion for the Rayleigh-wave attenuation coefficient \(\alpha\), as verified by a suite of numerical tests. Such procedure builds on the previous work of [3], and allows for measuring the frequency-dependency of \(\alpha\).

If we consider a dense seismic array, whose receivers are located at \({\bf x}\), SeisLib first retrieves the normalized cross-spectra associated with each combination of station pairs ij

\[\rho = \Re{ \Bigg\lbrace \frac{u ({\bf x}_i,\omega) u^{\ast} ({\bf x}_j,\omega)} {\left\langle \vert u({\bf x},\omega) \vert^2 \right\rangle_{\bf x}}} \Bigg\rbrace, \]

where u denotes the ambient-noise recording in the frequency domain, \(^{\ast}\) complex conjugation, and \(\Re{\big\lbrace \dotso \big\rbrace}\) maps a complex number into its real part. The normalization term \(\left\langle \vert u({\bf x},\omega) \vert^2 \right\rangle_{\bf x}\) corresponds to the average power spectral density (PSD) recorded by the seismic array.

Then, \(\alpha\) can be constrained by minimizing the cost function [2]

\[C(\alpha, \omega)= \sum_{i,j} \Delta_{ij}^2 \Biggr\rvert \mbox{env}\left[\rho \right] - \mbox{env}\left[ J_0\left(\frac{\omega \Delta_{ij}}{c_{ij}(\omega)}\right) \mbox{e}^{-\alpha(\omega) \Delta_{ij}} \right] \Biggr\rvert^2,\]

where \(\Delta\), \(\omega\), and c denote inter-station distance, angular frequency, and phase velocity, respectively, and \(J_0\) a zeroth order Bessel function of the first kind. The weight \(\Delta_{ij}^2\) is used to compensate for the decrease in the amplitude of \(J_0\) with inter-station distance, due to geometrical spreading, and the envelope function \(\mbox{env}\) has beneficial effects on the stability of the inversion.

In the above equation, phase velocity c is assumed to be known, i.e., it should be calculated in a preliminary step (see seislib.an.an_velocity). The minimum of \(C(\alpha, \omega)\) can then be found via “grid-search” over \(\alpha\), for a discrete set of values of \(\omega\). The alert reader might notice at this point that, since the sum is carried out over all possible combinations of station pairs belonging to the considered array, only one attenuation curve can be extracted from such minimization. This strategy, albeit in a sense restrictive, has been shown to yield robust estimates of the frequency dependency of \(\alpha\) even in presence of a heterogeneous distribution of noise sources [2]. If the array has good azimuthal coverage, using all station pairs as an ensemble in the minimization of \(C(\alpha, \omega)\) allows for sampling most azimuths of wave propagation. In turn, this should “regularize” the inversion by decreasing unwanted effects due to inhomogeneities in the distribution of the noise sources, or compensating for a non-perfectly diffuse ambient seismic field.

Since the above equation only allows to obtain one attenuation curve for seismic array, to retrieve the spatial variations in attenuation SeisLib implements the strategy applied by [4] to USArray data. In practice, parameterize() allows for subdividing a given array in many (possibly overlapping) sub-arrays. These are identified by equal-area blocks of arbitrary size, and used to obtain one attenuation curve for each of them.

References

1

Tsai, (2011). Understanding the amplitudes of noise correlation measurements. JGR

2(1,2,3)

Magrini & Boschi, (2021). Surface‐Wave Attenuation From Seismic Ambient Noise: Numerical Validation and Application. JGR

3

Boschi et al., (2019). On seismic ambient noise cross-correlation and surface-wave attenuation. GJI

4

Magrini et al., (2021). Rayleigh-wave attenuation across the conterminous United States in the microseism frequency band. Scientific Reports

class seislib.an.an_attenuation.AmbientNoiseAttenuation(src, savedir=None, verbose=True)

Bases: object

Class to obtain Rayleigh-wave attenuation out of continuous seismograms, via cross-correlation of seismic ambient noise.

Parameters
srcstr

Absolute path to the directory containing the files associated with the continuous seismograms

savedirstr

Absolute path to the directory where the results are saved

verbosebool

Whether or not information on progress is printed in the console

References

1

Magrini et al., (2021). Rayleigh-wave attenuation across the conterminous United States in the microseism frequency band. Scientific Reports

2

Magrini & Boschi, (2021). Surface‐Wave Attenuation From Seismic Ambient Noise: Numerical Validation and Application. JGR

Examples

In the following, we will calculate an attenuation map for a region characterized by a dense distribution of receivers. When applied to the transportable component of the USArray, the code will produce maps analogous to those in [1]. To do so, we carry out the following steps squentially: (i) define an overlapping parameterization of the study area, (ii) compute the Fourier transforms of our continuous seismograms, (iii) use the thus retrieved Fourier transforms to compute the average power-spectral density in each grid cell of our parameterization; these are used in the calculation of the cross-spectra as normalization term [2]. (iv) Prepare the inversion data-set, for each sub-array associated with each grid cell of our parameterization. This data-set consists of the cross-spectra and of the previously computed inter-station phase velocities. (v) Invert the data-set associated with each grid cell, so as to retrieve one attenuation curve per grid cell, and (vi) plot the results.

First, we need to define an AmbientNoiseAttenuation instance, and retrieve the geographic coordinates for each receivers. Our waveform data are in .sac format, with the header compiled with the needed information. (To download seismic data in the proper format, consider using seislib.an.an_downloader.ANDownloader) These are stored in /path/to/data. We will save the results in /savedir:

src = '/path/to/data'
save = '/savedir'
an = AmbientNoiseAttenuation(src=src, savedir=save, verbose=True)
an.prepare_data()

prepare_data() will automatically save a station.pickle file in $save

Note

If the files containing the continuous seismograms do not have a sac header, or if they have a different extension than .sac, the user can create the station.pickle file on his own, before calling prepare_data(). (However, each file name should be in the format net.sta.loc.cha’, where net, sta, and loc (optional) are network, station, and location code, respectively, and cha is channel (e.g., BHZ, BHN, or BHE). For example, TA.N38A..LHZ.sac.

The stations.pickle file should contain a dictionary object where each key corresponds to a station code ($net.$sta) and each value should be a tuple containing latitude and longitude of the station. For example:

{ net1.sta1 : (lat1, lon1),
net1.sta2 : (lat2, lon2),
net2.sta3 : (lat3, lon3)
}

We now parameterize the study area, by means of an equal-area grid of cell dimensions of 2.5°, with a spatial overlap of 50% on both latitude and longitude. Those grid cells containing less than 6 receivers are discarded:

an.parameterize(cell_size=2.5, 
                overlap=0.5, 
                min_no_stations=6)

The above will create the file paramerization.pickle in $save, and set the corresponding dictionary object to the attribute paramterization of an. This can be accessed typing an.parameterization, and will be used under the hood in the following processing. We now compute the Fourier transforms and the cross-spectra. To compute the Fourier transforms, we use time windows of 6h, and in the calculation of the cross spectra we will discard every grid cell for which at least 6 station pairs having at least 30 days of simultaneous recordings are not available. (fs indicates the sampling rate.):

an.compute_ffts(fs=1, window_length=3600*6)
an.compute_corr_spectra(min_no_pairs=6, 
                        min_no_days=30)

The above operation might require a long time and occupy a relatively large space on disk (even more than the size of the original data, depending on the target sampling rate fs). Three directories (fft, psd, cross-spectra) will be created in $save, where the result of the processing is stored.

Before preparing the final data-set for the inversion, we need to calculate inter-station phase velocities. We did so by using seislib.an.an_velocity.AmbientNoiseVelocity. The retrieved dispersion curves are stored in /path/to/dispcurves:

an.prepare_inversion(src_velocity='/path/to/dispcurves',
                     freqmin=1/15, 
                     freqmax=0.4, 
                     nfreq=300)

The above will prepare the data-set needed by the inversion, using a frequency range of 1/15 - 0.4 Hz, sampled by 300 points so that each frequency is equally spaced from each other on logarithmic scale. The data-set, built for each grid cell individually, is automatically saved to $save/inversion_dataset. To retrieve one attenuation curve for grid cell, we can now run:

an.inversion(alphamin=5e-8, alphamax=1e-4, nalpha=350, min_no_pairs=6)

that will search for the optimal frequency-dependent attenuation coefficient in the range \(5 \times 10^{-8}\) - \(1 \times 10^{-4}\) 1/m via a 2-D grid search carried out over \(\alpha\) and frequency. The \(\alpha\) range is sampled through 350 points equally spaced on a logarithmic scale. All grid cells for which data are not available from at least 6 station pairs are discarded.

The above will save the results in $save/results. These can be visualized in the form of maps at different periods. In the following, we calculate an attenuation map at the period of 4 s, parameterizing it as an equal-area grid of pixel-size = 1°. Only the grid cells of this equal-area grid for which at least 2 attenuation curves are available (remember that we used a 50% overlapping parameterization for the calculation of the attenuation curves) will be attributed an attenuation value. (This behaviour is controlled by the parameter min_overlapping_pixels). Once the attenuation map is created, we plot it:

grid, attenuation = an.get_attenuation_map(period=4, 
                                           cell_size=1,
                                           min_overlapping_pixels=2)
an.plot_map(grid, attenuation)

Note

Once you have prepared the inversion data set by prepare_inversion() (and you are satisfied by the parameters passed therein), you can free the previously occupied space on disk by deleting the directories $save/ffts, $save/psd, since they will not be used in the subsequent inversion.

Attributes
srcstr

Absolute path to the directory containing the files associated with the continuous seismograms

savedirstr

Absolute path to the directory where the results are saved

fileslist

List of files to be processed to calculate the attenuation curves

verbosebool

Whether or not information on progress is printed in the console

stationsdict

Coordinates of the station pairs that can be employed to retrieve dispersion curves (can be accessed after calling the method prepare_data)

parameterizationdict

Information on the parameterization of the study area, consisting of a dictionary object of the form:

{
'grid': np.array([[lat1, lat2, lon1, lon2],
                  [lat3, lat4, lon3, lon4]
                  [etc.]]),
'no_stations': np.array([int1, int2, etc.]),
'station_codes': [[sta1, sta2, sta3, etc.],
                  [sta4, sta5, sta6, etc.]]
}

where lat1, lat2, lon1, lon2 identify the first block of the parameterization, int1 is the number of stations in the first block, and sta1, sta2, sta3, etc. are the associated station codes.

Methods

colormesh(mesh, c, ax, **kwargs)

Adaptation of matplotlib.pyplot.pcolormesh to an (adaptive) equal-area grid.

compute_corr_spectra([min_no_pairs, ...])

Computes the corr-spectra for each pair of receivers in each grid cell of the parameterization.

compute_ffts([fs, window_length])

Computes the Fourier transforms for all the continuous data.

contour(mesh, c, ax[, smoothing])

Adaptation of matplotlib.pyplot.contour to the (adaptive) equal area grid

contourf(mesh, c, ax[, smoothing])

Adaptation of matplotlib.pyplot.contourf to an (adaptive) equal-area grid

get_attenuation_map(period, cell_size[, ...])

Calculates an attenuation maps from the inversion results.

get_files()

Retrieves the files to be processed for extracting Rayleigh-wave attenuation on the vertical component.

get_stations_coords(files)

Retrieves the geographic coordinates associated with each receiver

inversion([alphamin, alphamax, nalpha, ...])

Carries out the inversion for Rayleigh-wave attenuation

parameterize(cell_size[, overlap, ...])

Creates the equal area (possibly overlapping) parameterization used in the subsequent analysis.

plot_azimuthal_coverage(ipixel[, ax, bins, show])

Plots a polar histogram of the azimuthal coverage for one pixel

plot_cost(ipixel[, ax, alphamin, alphamax, ...])

Plots the values of cost obtained in the inversion of a given pixel

plot_map(mesh, c[, ax, projection, ...])

Displays an attenuation map.

plot_stations([ax, show, oceans_color, ...])

Maps the seismic receivers for which data are available

prepare_data([recompute])

Saves to disk the geographic coordinates associated with each receiver.

prepare_inversion(src_velocity[, freqmin, ...])

Prepares the data set to be inverted for Rayleigh-wave attenuation.

classmethod colormesh(mesh, c, ax, **kwargs)

Adaptation of matplotlib.pyplot.pcolormesh to an (adaptive) equal-area grid.

Parameters
meshndarray of shape (n, 4)

Equal area grid, consisting of n pixels described by lat1, lat2, lon1, lon2

carray-like of shape (n,)

Values to plot in each grid cell

axcartopy.mpl.geoaxes.GeoAxesSubplot

If not None, the receivers are plotted on the GeoAxesSubplot instance. Otherwise, a new figure and GeoAxesSubplot instance is created

**kwargsdict, optional

Additional inputs passed to seislib.plotting.plotting.colormesh

Returns
imgInstance of matplotlib.collections.QuadMesh
compute_corr_spectra(min_no_pairs=6, min_no_days=30, ram_available=9000, ram_split=4)

Computes the corr-spectra for each pair of receivers in each grid cell of the parameterization. In practice, all the Fourier transform associated with the continuous seismograms found in a given cell are first loaded into memory. Then, they are used to calculate the average power-spectral density of the grid-cell, which is then used as a normalization term of the cross-correlations (see [1], [2], and [3]).

Two directories are created: $self.savedir/corr_spectra and $self.savedir/psd, where the cross-spectra and the average psd will be saved.

Parameters
min_no_pairsint

Minimum number of pairs of receivers in a given grid cell available for computing the cross-spectrum. Grid cells not reaching this number are not processed. Default is 6. Larger values yield more robust estimates of Rayleigh-wave attenuation [3]. Smaller values are suggested against

min_no_daysint, float

Minimum number of simultaneous recordings available for a given station pair to be considered for the calculation of the cross- spectrum. Station pairs not reaching this number do not count as available station pairs (see min_no_pairs)

ram_availableint, float

RAM available (in Mb) for the analysis. The parameter allows for estabilishing, in a given pixel, if all the Fourier transforms related to the pixel can be loaded into memory or a generator is otherwise required. Lower values are suggested to avoid MemoryErrors, although they would result in a longer computational time. Default is 9000 (i.e., 9 Gb)

ram_splitint

When the (estimated) RAM necessary to load the Fourier transforms of a given pixel into memory exceeds ram_available, the task is splitted in ram_split operations. Larger values help avoiding MemoryErrors, but result in a longer computational time. Default is 4

References

1

Boschi et al., (2019). On seismic ambient noise cross-correlation and surface-wave attenuation. GJI

2

Boschi et al., (2020). Erratum: On seismic ambient noise cross-correlation and surface-wave attenuation. GJI

3(1,2)

Magrini & Boschi (2021). Surface‐Wave Attenuation From Seismic Ambient Noise: Numerical Validation and Application. JGR

compute_ffts(fs=1, window_length=3600)

Computes the Fourier transforms for all the continuous data.

A directory $self.savedir/fft will be created, and all Fourier transforms will be saved within it. Within the same directory, two other files will be created: (i) the frequency vector associated with each Fourier transform, named frequencies.npy, and (ii) a dictionary object where the starting and ending times of all continuous seismograms are stored, named times.pickle

Parameters
fsint, float

Target sampling rate of the continuous seismograms. If some seismogram is characterized by a different sampling rate, it will be resampled

window_lenghtint

Length of the time windows (in s) used to perform the cross-correlations

.. warning::

The operation might take a relatively large space on disk, depending on the amount of continuous seismograms available and on the sampling rate chosen. (Even more than the size of the original data.) This, however, is necessary to speed up (consistently) all the subsequent operations, involving cross-correlations and calculation of the average power spectral density of each sub-array.

classmethod contour(mesh, c, ax, smoothing=None, **kwargs)

Adaptation of matplotlib.pyplot.contour to the (adaptive) equal area grid

Parameters
meshndarray of shape (n, 4)

Equal area grid, consisting of n pixels described by lat1, lat2, lon1, lon2

carray-like of shape (n,)

Values to plot in each grid cell

axcartopy.mpl.geoaxes.GeoAxesSubplot

If not None, the receivers are plotted on the GeoAxesSubplot instance. Otherwise, a new figure and GeoAxesSubplot instance is created

smoothingfloat, optional

Value passed to scipy.ndimage.filters.gaussian_filter and used to obtain a smooth representation of c (default is None)

**kwargsdict, optional

Additional inputs passed to seislib.plotting.plotting.colormesh

Returns
imgInstance of matplotlib.contour.QuadContourSet
classmethod contourf(mesh, c, ax, smoothing=None, **kwargs)

Adaptation of matplotlib.pyplot.contourf to an (adaptive) equal-area grid

Parameters
meshndarray of shape (n, 4)

Equal area grid, consisting of n pixels described by lat1, lat2, lon1, lon2

carray-like of shape (n,)

Values to plot in each grid cell

axcartopy.mpl.geoaxes.GeoAxesSubplot

If not None, the receivers are plotted on the GeoAxesSubplot instance. Otherwise, a new figure and GeoAxesSubplot instance is created

smoothingfloat, optional

Value passed to scipy.ndimage.filters.gaussian_filter and used to obtain a smooth representation of c (default is None)

**kwargsdict, optional

Additional inputs passed to seislib.plotting.plotting.colormesh

Returns
imgInstance of matplotlib.contour.QuadContourSet
get_attenuation_map(period, cell_size, min_overlapping_pixels=2)

Calculates an attenuation maps from the inversion results.

The map will be parameterized using an equal-area grid, built thorugh seislib.tomography.grid.EqualAreaGrid

Parameters
periodfloat, int
cell_sizefloat, int

Size of each grid cell (in degrees) of the resulting parameterization

min_overlapping_pixels: int

At a given period, each value of \(\alpha\) in the resulting map will be given by a weighted average of the attenuation curves retrieved in the inversion (the weights are determined by the number of station pairs used in the inversion of each sub-array). This parameters determines the minimum number of attenuation curves used to constrain \(\alpha\) in a given pixel of the resulting map. Default is 2

Returns
new_meshndarray of shape (n, 4)

Parameterization of the final map, where each row corresponds to a grid cell and the columns define its boundaries (lat1, lat2, lon1, lon2)

attenuationndarray of shape (n,)

Value of attenuation associated with each grid cell of new_mesh

get_files()

Retrieves the files to be processed for extracting Rayleigh-wave attenuation on the vertical component.

Returns
fileslist of str

e.g., [‘net1.sta1.00.BHZ.sac’, ‘net1.sta2.00.BHZ.sac’]

get_stations_coords(files)

Retrieves the geographic coordinates associated with each receiver

Parameters
files: list of str

Names of the files corresponding with the continuous seismograms, located in the src directory

Returns
coordsdict

each key corresponds to a station code ($network_code.$station_code) and each value is a tuple containing latitude and longitude of the station. For example:

{ net1.sta1 : (lat1, lon1),
  net1.sta2 : (lat2, lon2),
  net2.sta3 : (lat3, lon3)
  }
inversion(alphamin=5e-07, alphamax=0.0005, nalpha=350, min_no_pairs=6)

Carries out the inversion for Rayleigh-wave attenuation

For each pickle file available in $self.savedir/inversion_dataset, an inversion is carried out by minimizing the misfit between the corr- spectra and the corresponding Bessel functions, so as to retrieve one attenuation curve for each pixel [1].

The results (one attenuation curve per pickle file), are saved at $self.savedir/results, and consist of (i) best_alphas.pickle: a dictionary object containing, for each grid cell, the retrieved attenuation curve and the number of station pairs used to constrain it; (ii) frequencies.npy: frequency vector associated with the attenuation curves; (iii) pixel$d.pickle (where $d indicates the index of the grid cell in $self.savedir/parameterization.pickle): dictionary object containing, along with the information stored in (i), the minimized misfit matrix to retrieve the attenuation curve.

Parameters
alphamin, alphamaxfloat

Attenuation range (in 1/m) within which the best-fitting attenuation is sought as a function of frequency. Defaults are 5e-7 and 5e-4

nalphaint

Number of values of \(\alpha\) used to subdivide the attenuation range geometrically (see numpy.geomspace). Default is 350

min_no_pairsint

Minimum number of stations pairs required to carry out the inversion. If, in a given sub-array, the number of available data is smaller than this number, the pixel is not inverted for. Default is 6. Smaller values are suggested against.

References

1

Magrini & Boschi (2021). Surface‐Wave Attenuation From Seismic Ambient Noise: Numerical Validation and Application. JGR

parameterize(cell_size, overlap=0.5, min_no_stations=6, regular_grid=False, plotting=True, plot_resolution='110m')

Creates the equal area (possibly overlapping) parameterization used in the subsequent analysis. The equal-area grid is created through the :class:seislib.tomography.grid.EqualAreaGrid class of seislib.

The parameterization is saved at $savedir/parameterization.pickle and set as an attribute of the AmbientNoiseAttenuation instance under the name parameterization. The corresponding dictionary object includes information on the parameterization of the study area, and has the form:

{
'grid': np.array([[lat1, lat2, lon1, lon2],
                  [lat3, lat4, lon3, lon4]
                  [etc.]]),
'no_stations': np.array([int1, int2, etc.]),
'station_codes': [[sta1, sta2, sta3, etc.],
                  [sta4, sta5, sta6, etc.]]
}

where lat1, lat2, lon1, lon2 identify the first block of the parameterization, int1 is the number of stations in the first block, and sta1, sta2, sta3, etc. are the associated station codes.

Parameters
cell_sizefloat

Size of each grid cell of the resulting parameterization (in degrees)

overlapfloat

If > 0, the parameterization will be overlapping in space by the specified extent [1]. Default is 0.5

min_no_stationsint

Minimum number of stations falling within each grid-cell. If the value is not reached, the grid-cell in question is removed from the parameterization

regular_gridbool

If False (default), the study area is discretized using an equal-area parameterization. Otherwise, a regular grid is employed.

plottingbool

If True, a figure on the resulting parameterization is displayed

plot_resolution{‘10m’, ‘50m’, ‘110m’}

Resolution of the Earth features displayed in the figure. Passed to cartopy.feature.NaturalEarthFeature. Valid arguments are ‘110m’, ‘50m’, ‘10m’. Default is ‘110m’

References

1

Magrini et al., (2021). Rayleigh-wave attenuation across the conterminous United States in the microseism frequency band. Scientific Reports

plot_azimuthal_coverage(ipixel, ax=None, bins=None, show=True, **kwargs)

Plots a polar histogram of the azimuthal coverage for one pixel

Parameters
ipixelint

Index of the pixel used in the inversion for alpha

axmatplotlib instance of PolarAxesSubplot, optional

If None, a new figure and PolarAxesSubplot is created

binsarray-like, optional

Azimuthal bins (in degrees). If None, bins of 30 degrees between 0 and 360 are used

showbool

If False, ax is returned. Else, the figure is displayed. Default is True

**kwargsdict, optional

Additional arguments passed to matplotlib.pyplot.bar

Returns
None if show is True, else ax
plot_cost(ipixel, ax=None, alphamin=None, alphamax=None, freqmin=None, freqmax=None, show=True, curve_dict={}, **kwargs)

Plots the values of cost obtained in the inversion of a given pixel

Parameters
ipixelint

Index of the pixel used in the inversion for alpha

axmatplotlib instance of AxesSubplot, optional

If None, a new figure and ax is created

showbool

If False, ax is returned. Else, the figure is displayed.

alphamin, alphamaxfloat, optional

Alpha range used to bound the yaxis in the figure. If None, the whole alpha range used in the inversion is displayed

freqmin, freqmaxfloat, optional

Frequency range used to bound the xaxis in the figure. If None, the whole Frequency range used in the inversion is displayed

showbool

If False, ax is returned. Else, the figure is displayed. Default is True

curve_dictdict, optional

Arguments passed to matplotlib.pyplot.plot, to control the aspect of the attenuation curve

**kwargsdict, optional

Additional arguments passed to matplotlib.pyplot.pcolormesh

Returns
If show is True, None, else a (2,) tuple containing
  • matplotlib.collections.QuadMesh (associated with pcolormesh);
  • matplotlib.pyplot.colorbar
classmethod plot_map(mesh, c, ax=None, projection='Mercator', map_boundaries=None, bound_map=True, colorbar=True, show=True, style='colormesh', add_features=False, resolution='110m', cbar_dict={}, **kwargs)

Displays an attenuation map.

Parameters
meshndarray of shape (n, 4)

Grid cells in SeisLib format, consisting of n pixels described by lat1, lat2, lon1, lon2 (in degrees).

cndarray of shape (n,)

Values associated with the grid cells (mesh)

axcartopy.mpl.geoaxes.GeoAxesSubplot

If not None, c is plotted on the GeoAxesSubplot instance. Otherwise, a new figure and GeoAxesSubplot instance is created

projectionstr

Name of the geographic projection used to create the GeoAxesSubplot. (Visit the cartopy website for a list of valid projection names.) If ax is not None, projection is ignored. Default is ‘Mercator’

map_boundariesiterable of floats, shape (4,), optional

Lonmin, lonmax, latmin, latmax (in degrees) defining the extent of the map

bound_mapbool

If True, the map boundaries will be automatically determined. Default is False

colorbarbool

If True (default), a colorbar associated with c is displayed on the side of the map

showbool

If True (default), the map will be showed once generated

style{‘colormesh’, ‘contourf’, ‘contour’}

Possible options are ‘colormesh’, ‘contourf’, and ‘contour’. Default is ‘colormesh’, corresponding to colormesh().

add_featuresbool

If True, natural Earth features will be added to the GeoAxesSubplot through seislib.plotting.plotting.add_earth_features(). Default is False. If ax is None, it is automatically set to True

resolution{‘10m’, ‘50m’, ‘110m’}

Resolution of the Earth features displayed in the figure. Passed to cartopy.feature.NaturalEarthFeature. Valid arguments are ‘110m’, ‘50m’, ‘10m’. Default is ‘110m’. Ignored if ax is not None.

cbar_dictdict, optional

Keyword arguments passed to matplotlib.colorbar.ColorbarBase

**kwargsdict, optional

Additional inputs passed to the method defined by style.

Returns
None if show is False else an instance of
matplotlib.collections.QuadMesh (together with an instance
of matplotlib.colorbar.Colorbar, if colorbar is True)
plot_stations(ax=None, show=True, oceans_color='water', lands_color='land', edgecolor='k', projection='Mercator', resolution='110m', color_by_network=True, legend_dict={}, **kwargs)

Maps the seismic receivers for which data are available

Parameters
axcartopy.mpl.geoaxes.GeoAxesSubplot

If not None, the receivers are plotted on the GeoAxesSubplot instance. Otherwise, a new figure and GeoAxesSubplot instance is created

showbool

If True, the plot is shown. Otherwise, a GeoAxesSubplot instance is returned. Default is True

oceans_color, lands_colorstr

Color of oceans and lands. The arguments are ignored if ax is not None. Otherwise, they are passed to cartopy.feature.NaturalEarthFeature (to the argument ‘facecolor’). Defaults are ‘water’ and ‘land’

edgecolorstr

Color of the boundaries between, e.g., lakes and land. The argument is ignored if ax is not None. Otherwise, it is passed to cartopy.feature.NaturalEarthFeature (to the argument ‘edgecolor’). Default is ‘k’ (black)

projectionstr

Name of the geographic projection used to create the GeoAxesSubplot. (Visit the cartopy website for a list of valid projection names.) If ax is not None, projection is ignored. Default is ‘Mercator’

resolution{‘10m’, ‘50m’, ‘110m’}

Resolution of the Earth features displayed in the figure. Passed to cartopy.feature.NaturalEarthFeature. Valid arguments are ‘110m’, ‘50m’, ‘10m’. Default is ‘110m’. Ignored if ax is not None.

color_by_networkbool

If True, each seismic network will have a different color in the resulting map, and a legend will be displayed. Otherwise, all stations will have the same color. Default is True

legend_dictdict, optional

Dictionary of keyword arguments passed to matplotlib.pyplot.legend

**kwargsdict, optional

Additional keyword arguments passed to matplotlib.pyplot.scatter

Returns
None if show is True, else ax, i.e. the GeoAxesSubplot
prepare_data(recompute=False)

Saves to disk the geographic coordinates associated with each receiver. These are saved to self.savedir/stations.pickle

The stations.pickle file contains a dictionary object where each key corresponds to a station code ($network_code.$station_code) and each value is a tuple containing latitude and longitude of the station. For example:

{ net1.sta1 : (lat1, lon1),
  net1.sta2 : (lat2, lon2),
  net2.sta3 : (lat3, lon3)
  }

If self.savedir/parameterization.pickle exists, the corresponding dictionary object is set as an attribute of the AmbientNoiseAttenuation instance under the name parameterization. This includes information on the parameterization of the study area, and has the form:

{
'grid': np.array([[lat1, lat2, lon1, lon2],
                  [lat3, lat4, lon3, lon4]
                  [etc.]]),
'no_stations': np.array([int1, int2, etc.]),
'station_codes': [[sta1, sta2, sta3, etc.],
                  [sta4, sta5, sta6, etc.]]
}

where lat1, lat2, lon1, lon2 identify the first block of the parameterization, int1 is the number of stations in the first block, and sta1, sta2, sta3, etc. are the associated station codes.

Parameters
recomputebool

If True, the station coordinates and times will be removed from disk and recalculated. Otherwise (default), if they are present, they will be loaded into memory, avoiding any computation. This parameter should be set to True whenever one has added new files to the source directory

prepare_inversion(src_velocity, freqmin=0.05, freqmax=0.4, nfreq=300, smooth=False, smoothing_window=25, smoothing_poly=2)

Prepares the data set to be inverted for Rayleigh-wave attenuation.

The data set used in the inversion, for each station pair, consists of (i) envelope of the cross-spectrum, (ii) envelope of the associated Bessel function \(J_0\), (iii) phase velocity, (iv) inter-station distance. These objects are collected, for each grid cell associated with the parameterization, for all station pairs available, and stored into a dictionary object that is written on disk at $self.savedir/inversion_dataset. The individual dictionaries are saved in the pickle format and named after the index of the grid cell in the belonging parameterization.

Parameters
src_velocitystr

Absolute path to the directory containing the phase-velocity dispersion curves associated with the station pairs for which cross-spectra are available

freqmin, freqmaxfloat

Frequency range to analyse. Default is 0.05 and 0.4 Hz

nfreqfloat

The frequency range in question will be subdivided in nfreq points geometrically spaced (see numpy.geomspace)

smoothbool

If True, the cross-spectra envelopes are smoothing with a Savitzky-Golay filter. Default is False

smoothing_windowodd int

Passed to scipy.signal.savgol_filter. Default is 25

smoothing_polyint

Passed to scipy.signal.savgol_filter. Default is 2

Lower-Level Support

This module provides useful functions for the processing of continuous recordings of seismic ambient noise. All these functions are called, under the hood, by seislib.an.AmbientNoiseVelocity and seislib.an.AmbientNoiseAttenuation.

seislib.an.an_processing.extract_dispcurve(frequencies, corr_spectrum, interstation_distance, ref_curve, freqmin=0, freqmax=1, cmin=1, cmax=5, filt_width=3, filt_height=1.0, x_step=None, pick_threshold=2, horizontal_polarization=False, manual_picking=False, plotting=False, savefig=None)

Function for picking the phase-velocity curve from the zero crossings of the frequency-domain representation of the stacked cross correlation.

The picking procedure is based on drawing an ellipse around each zero crossing and assigning a weight according to the distance from the zero crossing to the ellipse boundary. The weights are then stacked in a phase-velocity - frequency plane and a smoothed version of the zero-crossing branches is obtained. Because of the similarity to a kernel-density estimate, the elliptical shapes are called kernels in this code.

This procedure reduces the influence of spurious zero crossings due to data noise, makes it easier to identify the well constrained parts of the phase- velocity curve and to obtain a smooth dispersion curve. A reference dispersion curve must be given to guide the algorithm in finding the correct phase-velocity branch to start the picking, because parallel branches are subject to a \(2 \pi\) ambiguity.

Parameters
frequenciesndarray of shape (n,)

Frequency vector of the cross-spectrum

corr_spectrumndarray of shape (n,)

Real or complex-valued array containing the cross-correlation. (Only the real part is used.)

interstation_distancefloat

Inter-station distance in km

ref_curve: ndarray of shape (m, 2)

Reference phase velocity curve, where the 1st column is frequency, the 2nd is velocity in km/s. The final phase-velocity curve is only picked within the frequency range spanned by the reference curve.

freqmin, freqmaxfloat

Restricts the zero crossings to the corresponding frequency range. Values in the frequency vector outside this range are ignored. Default values are 0 and 1

cmin, cmaxfloat (in km/s)

Min and max (estimated) surface-wave velocities. Only zero crossings corresponding to velocities within this range are considered. Default values are 1 and 5

filt_widthint

Controls the width of the smoothing window. Corresponds to the number of zero crossings that should be within one window. Default is 3

filt_heightfloat

Controls the height of the smoothing window. Corresponds to the portion of a cycle jump. Should never exceed 1, otherwise it will smooth over more than one cycle. Default is 1

x_step: float

Controls the step width for the picking routine along with the x (frequency) axis. Expressed in fractions of the expected step width between two zero crossings. If not provided, it is chosen automatically

horizontal_polarizationbool

If True, the zero crossings from the cross-spectrum are compared to the difference function \(J_0 - J_2\) (Bessel functions of the first kind of order 0 and 2 respectively) [1]. Should be True for Love and radially-polarized Rayleigh waves. If False (default) only \(J_0\) is used

manual_pickingbool

If True, the user is required to pick the dispersion curve manually. The picking is carried out through an interactive plot.

plottingbool

If True, a control plot is created and information on the picking procedure are printed. Default is False

savefigstr

Absolute path. If not None, and if plotting is True, the control plot is saved on disk at the absolute path provided

Returns
crossingsndarray of shape (k, 2)

2-D array containing frequencies (1st column) and corresponding velocities (2nd column, in km/s) at the zero crossings used to pick the dispersion curve

Dispersion curvendarray of shape (l, 2)

Picked phase-velocity curve, where the 1st column is frequency and the 2nd is velocity in km/s

References

1

Kästle et al. 2016, Two-receiver measurements of phase velocity: cross-validation of ambient-noise and earthquake-based observations, GJI

seislib.an.an_processing.get_zero_crossings(freq, xcorr, dist, freqmin=0, freqmax=1, cmin=1, cmax=5, horizontal_polarization=False, return_indexes=False)

Returns the zero crossings from the smoothed complex cross-correlation spectrum.

Parameters
freqndarray of shape (n,)

Frequency vector

xcorrndarray of shape (n,)

Real or complex-valued array containing the cross-correlation. (Only the real part is used.)

distfloat

Interstation distance in km

freqmin, freqmaxfloat

Restricts the zero crossings to the corresponding frequency range. Values in frequency vector outside this range are ignored. Default values are 0 and 1

cmin, cmaxfloat (in km/s)

Min and max (estimated) surface-wave velocities. Only zero crossings corresponding to velocities within this range are considered. Default values are 1 and 5

horizontal_polarizationbool

If True, the zero crossings from the cross-spectrum are compared to the difference function \(J_0 - J_2\) (Bessel functions of the first kind of order 0 and 2 respectively) [1]. Should be True for Love and radially-polarized Rayleigh waves. If False (default) only \(J_0\) is used

return_indexesbool

If True, the different branches in the zero-crossings are enumerated and returned. Default is False

Returns
If return_indexes is False (default):
ndarray of shape (m, 2)

2-D array containing frequencies (1st column) and corresponding velocities (2nd column, in km/s) at the zero crossings

Otherwise:
ndarray of shape (m, 3)

Where the third column is the index corresponding to each branch

References

1

Kästle et al. 2016, Two-receiver measurements of phase velocity: cross- validation of ambient-noise and earthquake-based observations, GJI

seislib.an.an_processing.noisecorr(tr1, tr2, window_length=3600, overlap=0.5, whiten=True, psd=False, waterlevel=1e-10)

Cross correlation of continuous seismograms.

The two seismograms are first sliced to a common time window and, if they are characterized by two different sampling rate, the one with the larger sampling rate is downsampled. The two seismograms are then subdivided into (possibly overlapping) time windows, which are cross correlated in the frequency domain. The cross correlations are then stacked and ensemble averaged to obtain the final cross spectrum in the frequency domain.

Parameters
tr1, tr2obspy.Trace

Continuous seismograms to be cross-correlated

window_lenghtfloat

Length of the time windows (in seconds) used to perform the cross correlations. Default is 3600

overlapfloat

Should be >=0 and <1 (strictly smaller than 1). Rules the extent of overlap between one time window and the following [2]. Default is 0.5

whitenbool

Whether or not the individual cross correlations are normalized by whitening [1]. Default is True

psdbool

Whether or not the individual cross correlations are normalized by the average psd. Default is False. If whiten is True, psd is ignored

waterlevelfloat

Only applied if whiten is True. It prevents ZeroDivisionError. Default is 1e-10

Returns
freqndarray of shape (n,)

Frequency associated with the cross spectrum. It is calculated as np.fft.rfftfreq(win_samples, dt), where dt is the time delta in the input seismograms, and win_samples is int(window_length / dt)

corr_spectrumndarray of shape (n,)

Ensemble average of the individual cross correlations

Raises
NonFiniteDataException

If one of the two traces contains non-finite elements

TimeSpanException

If the input traces do not span common times

References

1

Bensen et al. 2007, Processing seismic ambient noise data to obtain reliable broad-band surface wave dispersion measurements, GJI

2

Seats et al. 2012, Improved ambient noise correlation functions using Welch’s method, GJI

seislib.an.an_processing.velocity_filter(freq, corr_spectrum, interstation_distance, cmin=1.0, cmax=5.0, p=0.05)

Filters a frequency-domain cross-spectrum so as to remove all signal corresponding to a specified velocity range.

In practice, the procedure (i) inverse-Fourier transforms the cross spectrum to the time domain; (ii) it zero-pads the resulting time-domain signal at times corresponding to velocities outside the specified velocity range by applying a cosine taper (the same cosine taper is applied at the two ends of the interval); (iii) a forward-Fourier transform brings back the padded cross correlation to the frequency domain [1].

Parameters
freqndarray of shape (n,)

Frequency vector

cross_spectrumndarray of shape (n,)

Complex-valued frequency-domain cross_spectrum

interstation_distancefloat (in km)
cmin, cmaxfloat (in km/s)

Velocity range. Default values are 1 and 5

pfloat

Decimal percentage of cosine taper. Default is 0.05 (5%)

Returns
corrndarray of shape (n,)

Filtered cross-spectrum

References

1

Sadeghisorkhani et al. 2018, GSpecDisp: a matlab GUI package for phase-velocity dispersion measurements from ambient-noise correlations, Computers & Geosciences