New Feature Idea ... need help if possible #659
-
I may have a few things that can help...
The expression you found, abs((1 ± battery_instant_power/abs(battery_instant_power)) * battery_instant_power/2), is a clever mathematical trick to split a value into its positive and negative components based on its sign. It's commonly used in systems like energy monitoring to distinguish between power flowing out of a battery and power flowing into it.
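Here is the same arithmetic in a few lines of Python (the zero guard is my addition for the Python version; the sign convention in the comments is illustrative):

```python
def split_signed(p):
    """Split a signed power reading into (positive_part, negative_magnitude).

    Mirrors the expression abs((1 +/- p/abs(p)) * p / 2): the (1 + sign)
    factor is 2 when p > 0 and 0 when p < 0, so each half keeps only one
    direction of flow. p == 0 needs a guard here because p/abs(p) would
    divide by zero.
    """
    if p == 0:
        return 0.0, 0.0
    sign = p / abs(p)                   # +1 or -1
    from_pw = abs((1 + sign) * p / 2)   # positive component (e.g. discharging)
    to_pw = abs((1 - sign) * p / 2)     # negative component (e.g. charging)
    return from_pw, to_pw

print(split_signed(2000))   # (2000.0, 0.0)  -> all power counted as from_pw
print(split_signed(-1500))  # (0.0, 1500.0)  -> all power counted as to_pw
```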
For exploration help, and to see how the CQs populate the retention policies and measurements, I built an InfluxDB viewer that turns the database into a file system you can navigate:

```bash
# set up virtual python env
python -m venv .venv
source .venv/bin/activate
# install required libs
pip install requests colorama
# run viewer from the Powerwall-Dashboard directory
python tools/influxdb-viewer/viewer.py
```
-
Hi Jason,

Thanks for your detailed response. I had been aware of your viewer, and I had already been using it to try to understand the basic configuration of the data structures. Unfortunately, I didn't think to shut down the telegraf portion of the stack, which kept feeding in new data at second granularity; combined with the way the cat and tail commands are implemented, that made it hard to see the windows of old data I was downloading, especially since I was looking in the wrong retention policy location. I also hadn't understood the retention policies and their impact on data storage very well, so I wasn't always looking in the right place.

I had assumed (incorrectly) that both tesla-history.py and the telegraf plugin put data into raw. I had tried to connect to the influxdb manually and run some queries, but I didn't understand how to structure a query until I explored how you implemented the influxdb-viewer application. I had also looked at the code of the tesla-history utility; I didn't see it select a retention policy, so I assumed it would write to raw, and that the existing CQs would then calculate and populate the appropriate hourly/daily/monthly measurements.

From your tools, I've now understood how to query influx directly and filter by time, so I can see the specific windows I was testing. I've since learned that the 5-minute historical data is not stored in raw but in autogen, and I've seen what tesla-history updates after adding the data: it seems to run once some of the queries that the CQs would otherwise run, to update the summary measurements.
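For reference, the direct queries I mean look roughly like this. Just a sketch: I'm assuming InfluxDB on localhost:8086, the default powerwall database, no auth, and a placeholder date range from my testing window:

```python
import requests

# Query a specific retention policy (db.rp.measurement) over a time window.
QUERY = (
    "SELECT solar FROM powerwall.autogen.http "
    "WHERE time >= '2024-06-01T00:00:00Z' AND time < '2024-06-02T00:00:00Z' "
    "LIMIT 5"
)

resp = requests.get(
    "http://localhost:8086/query",
    params={"db": "powerwall", "q": QUERY},
    timeout=10,
)
resp.raise_for_status()

# Walk the InfluxDB 1.x JSON response shape: results -> series -> values.
for result in resp.json().get("results", []):
    for series in result.get("series", []):
        print(series["name"], series["columns"])
        for row in series["values"]:
            print(row)
```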
I do have a few questions that maybe you can help with.

First, the autogen CQ. It looks like it down-samples the data from raw.http.solar_instant_power, taking the average (mean) within each 1m grouping, and populates powerwall.autogen.http.solar with the result. Here's my question: I do see the modifier fill(linear), which should linearly interpolate missing data. Looking at the source data, I see ~5s granularity in raw.http, but with some variation in timestamps. If I captured only 3 samples in a particular minute, at offsets :01, :10, and :30 seconds, what would you expect the resampled output to look like? (I've sketched my current understanding below.)

When using the Tesla cloud and the scheduled telegraf module, the standard input goes to raw; raw.http seems to have ~5s granularity, and the CQ-generated autogen.http seems to have 1m granularity. However, if I load historical data using tesla-history.py, I see 5m granularity in autogen.http for the historical periods. Either way, the same CQ is then used to update the hourly data.
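Here is the toy sketch for the fill(linear) question above. The values are invented, and this is only my reading of the semantics: points within a bucket get a plain mean regardless of their offsets, and fill(linear) only synthesizes values for buckets that contain no points at all.

```python
from statistics import mean

# Toy data: (seconds past the minute boundary, watts). Three points land in
# minute 0, none in minute 1, two in minute 2 -- an assumed shape, not real
# dashboard data.
samples = {
    0: [(1, 1200.0), (10, 1260.0), (30, 1180.0)],  # minute 0: 3 points
    1: [],                                         # minute 1: empty bucket
    2: [(5, 900.0), (45, 880.0)],                  # minute 2: 2 points
}

buckets = []
for minute in sorted(samples):
    vals = [w for _, w in samples[minute]]
    # A bucket with any points reports the plain mean of those points; the
    # in-bucket offsets (:01, :10, :30) do not weight the result.
    buckets.append(mean(vals) if vals else None)

# fill(linear) touches only the empty buckets, interpolating between the
# nearest non-empty neighbours (midpoint here, since the gap is one bucket).
for i, v in enumerate(buckets):
    if v is None:
        prev = next(b for b in reversed(buckets[:i]) if b is not None)
        nxt = next(b for b in buckets[i + 1:] if b is not None)
        buckets[i] = (prev + nxt) / 2

print(buckets)  # [1213.33..., 1051.66..., 890.0]
```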
Second, would you be able to provide any color on the integral "phrase" in that CQ? I assume we use "integral" instead of "sum" to handle the varying granularity of the source data, but I'm not sure how this works. Would you be able to share any insight?

Also, these queries come from influxdb.sql, but ONLY two of them, cq_kwh and cq_daily, have the tz hard-coded in the SQL. I see that you use sed in setup.sh to change this for each user, but I'm not sure I understand why these specific queries care. Could you please comment?

Thanks,
Sean
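P.S. To make the integral question concrete, this is the toy comparison I have in mind. The numbers are invented, and trapezoid_wh is just my Python stand-in for what I believe InfluxQL's integral() computes (a time-weighted area under the power curve):

```python
def trapezoid_wh(points):
    """Time-weighted integral of (unix_seconds, watts) samples -> watt-hours.

    Because it measures area under the curve, the result is largely
    independent of how often the signal was sampled.
    """
    wh = 0.0
    for (t0, p0), (t1, p1) in zip(points, points[1:]):
        wh += (p0 + p1) / 2 * (t1 - t0) / 3600.0
    return wh

# One hour of constant 1200 W, sampled two different ways:
fine = [(t, 1200.0) for t in range(0, 3601, 5)]      # ~5 s live data
coarse = [(t, 1200.0) for t in range(0, 3601, 300)]  # 5 m historical data

print(trapezoid_wh(fine), trapezoid_wh(coarse))
# 1200.0 Wh either way: the integral doesn't care about sample spacing.

print(sum(p for _, p in fine), sum(p for _, p in coarse))
# 865200 vs 15600: a plain sum scales with the number of samples, so it
# can't be compared across 5s and 5m data.
```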
-
Hi,
Powerwall-Dashboard is a great project. However, I'd love to use it to track a little more data. In my case, I have an older NEM1 system, and I recently added more capacity with a non-export + battery system. The original system uses a separate inverter and is AC-coupled, while the newer add-on has its PV panels directly connected to a PW3. Powerwall-Dashboard currently shows the total power used by the house and the combined total solar from both systems. As a dashboard, that's pretty convenient; however, I'd also like to track the generation from each system in a separate tile. Is there a way to do this from the data collected? It seems that if I use mode 4, I also collect string data (just from the new PV panels connected to the PW3). Is there a way to sum and graph that, and also graph the current solar_total MINUS the PW3 strings? I didn't know if this would be possible, so I was looking for alternatives.
I've found that my original system can provide 5-minute production data in CSV format, which I'm currently able to export from its monitoring (a Sunny WebBox), and I've played with using telegraf to import it into influxdb. I've figured out how to parse the data and create the line protocol using telegraf. So, if I can figure out the format and the queries, I might be able to get the data from the original system by that method; a rough sketch of the conversion step is below.
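Independent of telegraf, this is the shape of the conversion I mean. The measurement, tag, and column names here are placeholders I made up, not the dashboard's schema, and I'm assuming the export's timestamps are UTC:

```python
import csv
from datetime import datetime, timezone

def csv_to_line_protocol(path):
    """Turn rows of (timestamp, watts) CSV into InfluxDB line protocol."""
    lines = []
    with open(path, newline="") as f:
        for row in csv.DictReader(f):  # assumed columns: timestamp, watts
            ts = datetime.fromisoformat(row["timestamp"]).replace(
                tzinfo=timezone.utc  # assuming the WebBox exports UTC
            )
            ns = int(ts.timestamp()) * 1_000_000_000  # line protocol wants ns
            lines.append(
                f"legacy_solar,source=webbox power={float(row['watts'])} {ns}"
            )
    return "\n".join(lines)

# Each output line looks like:
# legacy_solar,source=webbox power=2450.0 1717243500000000000
```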
I'm not an expert with influx, and I've been poking around the implementation. I can see that influx defines retention policies and then groups data within each policy. I've seen that the data from the powerwall is first collected in raw.http at a ~5s interval, and that there seems to be a CQ that "resamples??" that data into minute (kwh), daily, and monthly buckets. I also see an autogen bucket that I'm not 100% sure how it's used.
It appears that the CQs aggregate the "second" data from the raw bucket and populate the autogen, kwh, daily, and monthly buckets. While I was looking at the configuration of the data collection, I saw this in the SQL:
```sql
abs((1+battery_instant_power/abs(battery_instant_power))*battery_instant_power/2) AS from_pw,
abs((1-battery_instant_power/abs(battery_instant_power))*battery_instant_power/2) AS to_pw,
abs((1+site_instant_power/abs(site_instant_power))*site_instant_power/2) AS from_grid,
abs((1-site_instant_power/abs(site_instant_power))*site_instant_power/2) AS to_grid,
```
and I didn't understand the intent of the formula abs((1 + battery_instant_power/abs(battery_instant_power)) * battery_instant_power/2). Would you be able to provide any insight?
I figure my use case (if I use my CSV production data) would be similar to the tesla-history method: if I got the 5m data in, I could then create the hourly and daily production for the old system, and maybe use grafana to compute the difference. So I tried to understand what happens with that implementation, but I'm struggling to find the data that tesla-history puts into influx. Exploring how this is put together, I first created an empty database and then loaded data using the tesla-history tool (expecting ~5m granularity data), but with the influxdb-viewer tool I can't seem to find any of the data I loaded from tesla-history.py. If I actually launch powerwall-dashboard in mode 2, I expect to see 5m granularity data, but I seem to see ~5s (with time variations between samples). I thought that by using the Tesla cloud, I'd be limited to 5m granularity.
Could you provide any insight into the data source (the Tesla cloud)? How is it that it seems to provide 5s data for immediate consumption but ~5m data for longer periods? And is the data in the 5m or 5s intervals an average over the interval, a median, or an instantaneous value?
As a separate question, when I use mode 4 to connect to my TEG/PW3, I do see ~5s granularity data. Looking at the telegraf data, is this really a 5s average, or just an instantaneous reading that we assume is representative of the 5s interval?
I haven't tried mode 3 yet, but should I expect any different metrics or behavior vs mode 2?
Also, if during testing I want to dump all the data (so I can start again), do you have a simple recommendation? I tried commands similar to the ones tesla-history uses to drop the "cloud" data, adjusted for the different method of capture, but I kept finding that grafana could still find data to graph. Eventually, I just shut down the stack, deleted the influxdb data directory, and re-set up the connection, but I was hoping there was a simpler way; what I was hoping for is sketched below.
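Concretely, I was hoping something like this would suffice (assuming the stack's InfluxDB on localhost:8086 with no auth; my understanding is that DROP SERIES wipes the points while leaving the database, retention policies, and CQs in place):

```python
import requests

# Drop every series in the powerwall database without touching its
# retention policies or continuous queries.
resp = requests.post(
    "http://localhost:8086/query",
    params={"db": "powerwall", "q": "DROP SERIES FROM /.*/"},
    timeout=30,
)
resp.raise_for_status()
print(resp.json())
```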
Thanks,
Sean