Installing/Configuring the Time Series Database for Production

The Time Series Database (TSDB) tracks hundreds of parameters from the headers and extensions of L0/2D/L1/L2 files, including basic characteristics of the observations (exposure times, telescope pointing coordinates, etc.), telemetry, reduced RVs, Quality Control keywords, and diagnostic information. The TSDB has multiple uses: production processing (the setup of which is described here), testing, and packaging and transmitting information about a small set of observations (e.g., the RVs for a single star or for an observing program). The production version of the TSDB ingests data from the L0/2D/L1/L2 files primarily through scripts/ingest_dates_kpf_tsdb.py, triggered by file creation or modification events. An extensive set of plots is generated by scripts/generate_time_series_plots.py.

The TSDB can be run using a Postgres server or with SQLite, which does not require a server. For production processing, the Postgres implementation is strongly recommended for stability, speed, and robustness.

Setting up the Postgres Server

To-do: add instructions for setting up the Postgres server. This page should mirror the Pipeline Operations Database.

Regenerating TSDB Tables

The standard practice for many databases when new columns are added to the schema is to alter the schema with the data in place. The TSDB code is instead written so that it is easier to drop the tables, recreate them, and reingest all data. This is possible because the TSDB only ingests from L0/2D/L1/L2 files and is not written to during data processing by the DRP. The TSDB schema is defined by the contents of the .csv files in static/tsdb_tables/. Thus, updating the schema involves editing the .csv files, dropping the database tables, recreating them, and reingesting all data. (Data ingestion can take a few hours for years of KPF data.)
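The drop-and-recreate approach can be illustrated with a minimal sketch. It is not the TSDB implementation: the CSV column names (keyword, datatype, description), the table name, and the helper functions are hypothetical, and SQLite is used in memory only so that the example is self-contained.

```python
import csv
import io
import sqlite3

# Hypothetical schema definition. The real .csv files in static/tsdb_tables/
# may use different column names and contents.
EXAMPLE_CSV = """keyword,datatype,description
ObsID,TEXT,Observation identifier
EXPTIME,REAL,Exposure time in seconds
NOTJUNK,INTEGER,Quality Control flag
"""

def create_table_sql(table_name, csv_text):
    """Build a CREATE TABLE statement from a CSV schema definition."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    columns = ', '.join(f'"{r["keyword"]}" {r["datatype"]}' for r in rows)
    return f'CREATE TABLE "{table_name}" ({columns})'

def rebuild_table(conn, table_name, csv_text):
    """Drop the table if it exists, then recreate it from the schema."""
    conn.execute(f'DROP TABLE IF EXISTS "{table_name}"')
    conn.execute(create_table_sql(table_name, csv_text))
    conn.commit()

conn = sqlite3.connect(':memory:')
rebuild_table(conn, 'tsdb_example', EXAMPLE_CSV)

# List the column names of the rebuilt (empty) table
column_names = [row[1] for row in conn.execute('PRAGMA table_info("tsdb_example")')]
print(column_names)
```

Because the table is rebuilt from the schema file each time, adding a column only requires a new row in the CSV; the cost is that all data must be reingested afterwards.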

Another advantage of this strategy of reingesting data is that data from previous DRP processing remain in the TSDB until they are replaced by updated data from reprocessed files (this is what usually happens). The drawback is that if reprocessing fails (perhaps because of a bug in a pipeline version), the database will retain the old, and possibly incorrect, data. In the current implementation, there is no script that checks whether L0/2D/L1/L2 files have been removed and then updates the TSDB accordingly.
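A check for removed files could look roughly like the sketch below. It is purely illustrative and does not exist in the TSDB code: the table name (files), the column name (path), and the function find_stale_paths are assumptions, and an in-memory SQLite database stands in for the real backend.

```python
import os
import sqlite3
import tempfile

def find_stale_paths(conn, table='files', column='path'):
    """Return database paths whose files no longer exist on disk.

    The table and column names are hypothetical placeholders.
    """
    rows = conn.execute(f'SELECT "{column}" FROM "{table}"').fetchall()
    return sorted(p for (p,) in rows if not os.path.exists(p))

with tempfile.TemporaryDirectory() as tmp:
    kept = os.path.join(tmp, 'kept_L1.fits')
    removed = os.path.join(tmp, 'removed_L1.fits')
    open(kept, 'w').close()  # only one of the two files exists on disk

    db = sqlite3.connect(':memory:')
    db.execute('CREATE TABLE files (path TEXT)')
    db.executemany('INSERT INTO files VALUES (?)', [(kept,), (removed,)])

    # Only the path without a file on disk is reported as stale
    stale = find_stale_paths(db)
    db.close()

print([os.path.basename(p) for p in stale])
```

The stale paths returned by such a check could then be used to delete or flag the corresponding rows.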

To drop tables, recreate them, and reingest data, the following commands should be executed in a notebook or other environment. Note that drop_tables() should be executed with care and is only available to database users with the ‘superuser’ or ‘operations’ roles:

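from database.modules.utils.tsdb import TSDB

# Connect to the Postgres backend and drop the existing tables (use with care)
myDB = TSDB(backend='psql')
myDB.drop_tables()

# Reconnect; this is the step at which the tables are recreated
myDB = TSDB(backend='psql')

# Reingest a date range (YYYYMMDD) and report the resulting database status
start_date = '20250701'
end_date   = '20250705'
myDB.ingest_dates_to_db(start_date, end_date)
myDB.print_db_status()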

The example reingests the date range July 1-5, 2025. The dates can be modified to cover a different range. Alternatively, the script scripts/ingest_dates_kpf_tsdb.py will ingest the full set of available KPF files when it is run.
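For a long reingestion, it may be convenient to split the full date range into smaller pieces so that progress can be monitored between calls. The helper below is a hypothetical sketch (date_chunks is not part of the TSDB code) that yields consecutive start/end pairs in the same YYYYMMDD format used above.

```python
from datetime import datetime, timedelta

def date_chunks(start_date, end_date, days_per_chunk=30):
    """Yield (start, end) YYYYMMDD pairs that cover the range without overlap."""
    fmt = '%Y%m%d'
    start = datetime.strptime(start_date, fmt).date()
    end = datetime.strptime(end_date, fmt).date()
    if start > end:
        raise ValueError('start_date must not be later than end_date')
    if days_per_chunk < 1:
        raise ValueError('days_per_chunk must be at least 1')
    chunk_start = start
    while chunk_start <= end:
        # Each chunk spans days_per_chunk days, truncated at the end date
        chunk_end = min(chunk_start + timedelta(days=days_per_chunk - 1), end)
        yield chunk_start.strftime(fmt), chunk_end.strftime(fmt)
        chunk_start = chunk_end + timedelta(days=1)

chunks = list(date_chunks('20250701', '20250705', days_per_chunk=2))
print(chunks)
```

Each pair could then be passed to myDB.ingest_dates_to_db(start, end) in a loop, assuming the method treats both dates as inclusive.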