{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Batch Prediction\n", "\n", "## 1. Download demo data\n", "\n", "```\n", "cd PhaseNet\n", "wget https://github.com/wayneweiqiang/PhaseNet/releases/download/test_data/test_data.zip\n", "unzip test_data.zip\n", "```\n", "\n", "## 2. Run batch prediction \n", "\n", "PhaseNet currently supports four data formats: mseed, sac, hdf5, and numpy. \n", "\n", "- For mseed format:\n", "```\n", "python phasenet/predict.py --model=model/190703-214543 --data_list=test_data/mseed.csv --data_dir=test_data/mseed --format=mseed --plot_figure\n", "```\n", "\n", "- For sac format:\n", "```\n", "python phasenet/predict.py --model=model/190703-214543 --data_list=test_data/sac.csv --data_dir=test_data/sac --format=sac --plot_figure\n", "```\n", "\n", "- For numpy format:\n", "```\n", "python phasenet/predict.py --model=model/190703-214543 --data_list=test_data/npz.csv --data_dir=test_data/npz --format=numpy --plot_figure\n", "```\n", "\n", "- For hdf5 format:\n", "```\n", "python phasenet/predict.py --model=model/190703-214543 --hdf5_file=test_data/data.h5 --hdf5_group=data --format=hdf5 --plot_figure\n", "```\n", "\n", "- For a seismic array (used by [QuakeFlow](https://github.com/wayneweiqiang/QuakeFlow)):\n", "```\n", "python phasenet/predict.py --model=model/190703-214543 --data_list=test_data/mseed_array.csv --data_dir=test_data/mseed_array --stations=test_data/stations.json --format=mseed_array --amplitude\n", "```\n", "```\n", "python phasenet/predict.py --model=model/190703-214543 --data_list=test_data/mseed2.csv --data_dir=test_data/mseed --stations=test_data/stations.json --format=mseed_array --amplitude\n", "```\n", "\n", "Notes: \n", "1. Remove the \"--plot_figure\" argument for large datasets, because plotting can be very slow.\n", "\n", "Optional arguments:\n", "```\n", "usage: predict.py [-h] [--batch_size BATCH_SIZE] [--model_dir MODEL_DIR]\n", " [--data_dir DATA_DIR] [--data_list DATA_LIST]\n", " [--hdf5_file HDF5_FILE] [--hdf5_group HDF5_GROUP]\n", " [--result_dir RESULT_DIR] [--result_fname RESULT_FNAME]\n", " [--min_p_prob MIN_P_PROB] [--min_s_prob MIN_S_PROB]\n", " [--mpd MPD] [--amplitude] [--format FORMAT]\n", " [--s3_url S3_URL] [--stations STATIONS] [--plot_figure]\n", " [--save_prob]\n", "\n", "optional arguments:\n", " -h, --help show this help message and exit\n", " --batch_size BATCH_SIZE\n", " batch size\n", " --model_dir MODEL_DIR\n", " Checkpoint directory (default: None)\n", " --data_dir DATA_DIR Input file directory\n", " --data_list DATA_LIST\n", " Input csv file\n", " --hdf5_file HDF5_FILE\n", " Input hdf5 file\n", " --hdf5_group HDF5_GROUP\n", " data group name in hdf5 file\n", " --result_dir RESULT_DIR\n", " Output directory\n", " --result_fname RESULT_FNAME\n", " Output file\n", " --min_p_prob MIN_P_PROB\n", " Probability threshold for P pick\n", " --min_s_prob MIN_S_PROB\n", " Probability threshold for S pick\n", " --mpd MPD Minimum peak distance\n", " --amplitude if return amplitude value\n", " --format FORMAT input format\n", " --stations STATIONS seismic station info\n", " --plot_figure If plot figure for test\n", " --save_prob If save result for test\n", "```\n", "\n", "## 3. Output picks\n", "- The output picks are saved to \"results/picks.csv\" on default\n", "\n", "|file_name |begin_time |station_id|phase_index|phase_time |phase_score|phase_amp |phase_type|\n", "|-----------------|-----------------------|----------|-----------|-----------------------|-----------|----------------------|----------|\n", "|2020-10-01T00:00*|2020-10-01T00:00:00.003|CI.BOM..HH|14734 |2020-10-01T00:02:27.343|0.708 |2.4998866231208325e-14|P |\n", "|2020-10-01T00:00*|2020-10-01T00:00:00.003|CI.BOM..HH|15487 |2020-10-01T00:02:34.873|0.416 |2.4998866231208325e-14|S |\n", "|2020-10-01T00:00*|2020-10-01T00:00:00.003|CI.COA..HH|319 |2020-10-01T00:00:03.193|0.762 |3.708662269972206e-14 |P |\n", "\n", "Notes:\n", "1. The *phase_index* means which data point is the pick in the original sequence. So *phase_time* = *begin_time* + *phase_index* / *sampling rate*. The default *sampling_rate* is 100Hz \n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 3. Read P/S picks\n", "\n", "PhaseNet currently outputs two format: **CSV** and **JSON**" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import pandas as pd\n", "import json\n", "import os\n", "PROJECT_ROOT = os.path.realpath(os.path.join(os.path.abspath(''), \"..\"))" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "fname NC.MCV..EH.0361339.npz\n", "t0 1970-01-01T00:00:00.000\n", "p_idx [5999, 9015]\n", "p_prob [0.987, 0.981]\n", "s_idx [6181, 9205]\n", "s_prob [0.553, 0.873]\n", "Name: 1, dtype: object\n", "fname NN.LHV..EH.0384064.npz\n", "t0 1970-01-01T00:00:00.000\n", "p_idx []\n", "p_prob []\n", "s_idx []\n", "s_prob []\n", "Name: 0, dtype: object\n" ] } ], "source": [ "picks_csv = pd.read_csv(os.path.join(PROJECT_ROOT, \"results/picks.csv\"), sep=\"\\t\")\n", "picks_csv.loc[:, 'p_idx'] = picks_csv[\"p_idx\"].apply(lambda x: x.strip(\"[]\").split(\",\"))\n", "picks_csv.loc[:, 'p_prob'] = picks_csv[\"p_prob\"].apply(lambda x: x.strip(\"[]\").split(\",\"))\n", "picks_csv.loc[:, 's_idx'] = picks_csv[\"s_idx\"].apply(lambda x: x.strip(\"[]\").split(\",\"))\n", "picks_csv.loc[:, 's_prob'] = picks_csv[\"s_prob\"].apply(lambda x: x.strip(\"[]\").split(\",\"))\n", "print(picks_csv.iloc[1])\n", "print(picks_csv.iloc[0])" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'id': 'NC.MCV..EH.0361339.npz', 'timestamp': '1970-01-01T00:01:30.150', 'prob': 0.9811667799949646, 'type': 'p'}\n", "{'id': 'NC.MCV..EH.0361339.npz', 'timestamp': '1970-01-01T00:00:59.990', 'prob': 0.9872905611991882, 'type': 'p'}\n" ] } ], "source": [ "with open(os.path.join(PROJECT_ROOT, \"results/picks.json\")) as fp:\n", " picks_json = json.load(fp) \n", "print(picks_json[1])\n", "print(picks_json[0])" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3.10.4 64-bit", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.4" }, "orig_nbformat": 4, "vscode": { "interpreter": { "hash": "31f2aee4e71d21fbe5cf8b01ff0e069b9275f58929596ceb00d14d90e3e16cd6" } } }, "nbformat": 4, "nbformat_minor": 2 }