Database Design for Tracking Solar Production

We had a solar system installed at our house this year. I’m excited to see how this performs, as our estimates and research shows this ought to be a good financial decision for us over time. While the hardware that came with the system includes some monitoring and reporting, I wanted to track things independently to be sure that I have the data. I know many of these companies might not be as prepared for an issue as I would like, and if they lose some of my historical data, I’m not sure they care.

I decided to set up a small database, which will need an import process along with reporting and this is the first in a series of posts on how I’m addressing the database design. In this post, I’ll look at the initial tables I created.

Estimated Production Table

My system included some estimated levels of production for the year, and I will be able to record the actual levels each day. I decided to track these two sets of data separately for a couple reasons.

First, the estimates are monthly, and they do not vary. While I could just stick this data in the same table, it’s a lot of wasted data. Not a lot of space, but still, I decided to be efficient here. The estimates I have are a total for each month, with the math done to give me a daily power level. I decided to create this table:

CREATE TABLE [dbo].[SolarPowerEstimate](
     [TrackingKey] [int] IDENTITY(1,1) NOT NULL,
     [trackingmonth] [tinyint] NULL,
     [estimate_month] [numeric](6, 2) NULL,
     [estimate_daily] [numeric](4, 2) NULL,
  CONSTRAINT [SolarPowerPK] PRIMARY KEY CLUSTERED 

(
     [TrackingKey] ASC

)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

) ON [PRIMARY]

GO

This table has a PK just to keep things simple, and then I have a month number, which tracks for which month I have an estimate. There should only be 12 months in this table, as the estimate is supposed to repeat each year. I included the numeric values for the month and daily levels.

The data in this table looks like this:

I can join this with my actual production to compare how well things are working.

Actual Production

For Actual production, there is a value for each day. As a result, I need a date and a numeric value. I decided to separate out the date into separate parts, as I can always combine those, but this is really a data warehouse structure for me and I want to quickly join this with my estimate. I also expect to do some reporting by month, so having the month separated out (and the year) is a quick way to join data without needing a function.

CREATE TABLE   [dbo].[solarpoweractual](
     [TrackingKey] [int] IDENTITY(1,1) NOT NULL,
     [trackingyear] [smallint] NULL,
     [trackingmonth] [tinyint] NULL,
     [trackingday] [tinyint] NULL,
     [actual_daily] [numeric](10, 4) NULL,
  CONSTRAINT [SolarPowerActualPK] PRIMARY KEY CLUSTERED 

(
     [TrackingKey] ASC

)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

) ON [PRIMARY]

GO

This table will be populated with numbers for the date parts and then the production value. Right now, I see data like this:

I’ll go over reporting and how I use this data in another post, but there is one more table I need for this system.

Staging Imports

I can download data daily, but I really don’t care about the flows of the data each day. The data is reported each 15 minutes, but that’s a bit granular for me. Instead, I want to download monthly data. If I do that, I get a row for each day of the month, but some days are 0 if they are in the future. The current day is also incomplete until the sun goes down, so I may need to update that data regularly.

Rather then try to parse the data and build a complex ETL process, I’m aiming for an ELT, with a T that moves data from a staging table to my actual table with an upsert process.

The csv I get from my monitoring system is a date and a numeric value, so I built a staging table like this:

CREATE TABLE [dbo].[SolarStaging](
     [Time] [date] NOT NULL,
     [System_Production_Wh] [varchar](50) NOT NULL

) ON [PRIMARY]

GO

My aim here is to truncate this table, load the entire CSV, and then transform data as needed.

Summary

That’s the basics of my solar tracking database. I have a place to land new data, a table for the estimates I have for each month of the year, and then a table that is essentially a fact table of actual values.

I’ll add more details on how I load data, as well as how to analyze the data over time.

M	T	W	T	F	S	S
				1	2	3
4	5	6	7	8	9	10
11	12	13	14	15	16	17
18	19	20	21	22	23	24
25	26	27	28	29	30