How to Safely Perform Backfill Operations in TimescaleDB
Backfilling data into a TimescaleDB hypertable in production can be very tricky, especially when automated processes like compression policies are involved. From past experience, we have seen that if backfill operations aren’t handled properly, they can interfere with these automated tasks, sometimes causing them to stop working altogether. This blog covers a safer and more reliable approach to backfilling hypertables, along with best practices to prevent disruptions to compression and other background processes.What is a Backfill Operation?Backfilling means adding old or missing data into the database table after some time has already passed. Imagine you are collecting temperature readings every hour, but your system was down for a day and didn’t save any data. Later, you get that missing data from the local storage of the device or cloud storage, and want to put it back in the right hypertable, which is called backfilling. In TimescaleDB, this is common with time-series data, but it needs to be done carefully. That’s because TimescaleDB might already be doing things in the background, like compressing old data to save space. If we are not careful, backfilling can mess up these automatic tasks.