Which Capacity should we use? #692

Which capacity of the PV system should we use?
It is useful to normalise the PV data ready for ML, and dividing by the capacity makes sense.
Importantly, training and prediction must use the same capacity, hence #691.

The metadata for pvoutput.org and Passiv provides capacity values.
Note that the PV capacity can degrade over time, so a static value might not be so good.

# 1. Use the maximum values of the training set

Pros:
- the normalised data for each system will be between 0 and 1 inclusive
- the ML model can be PV system agnostic, as the data will be between 0 and 1

Cons:
- the prediction data might not be between 0 and 1 if it goes above anything seen in the training data (unlikely)
- I did this with the first CNN model and had to increase lots of capacities of ~10 systems, and reduce lots of capacities of ~10 systems
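
A minimal sketch of option 1, assuming readings arrive as `(system_id, power_kw)` pairs. The function names and the toy data are hypothetical and not taken from the codebase. It shows why training data lands in [0, 1] while live data can exceed 1.

```python
from collections import defaultdict


def max_capacities(training):
    """Return {system_id: maximum power seen in the training set}."""
    caps = defaultdict(float)
    for system_id, power_kw in training:
        caps[system_id] = max(caps[system_id], power_kw)
    return dict(caps)


def normalise(readings, capacities):
    """Divide each reading by its system's capacity.

    Assumes every system has a non-zero capacity; a system that never
    produced power in training would need handling separately.
    """
    return [(sid, power_kw / capacities[sid]) for sid, power_kw in readings]


# Toy data (made up for illustration)
training = [("a", 1.0), ("a", 4.0), ("b", 0.5), ("b", 2.0)]
capacities = max_capacities(training)

train_norm = normalise(training, capacities)     # all values in [0, 1]
live_norm = normalise([("a", 5.0)], capacities)  # 5.0 / 4.0 is above 1
```

The same `capacities` dictionary is reused for training and prediction, which is the requirement from #691.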

# 2. Use metadata

Pros: 
- This number is constant; it is given to us

Cons:
- It could be way off the actual power produced. 
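
One way to see how far off the metadata is would be to compare the observed peak against the metadata capacity per system. This is only a sketch: the dictionaries and values below are made up, and `capacity_ratios` is a hypothetical helper.

```python
def capacity_ratios(observed_max_kw, metadata_kw):
    """Return {system_id: observed peak / metadata capacity}."""
    return {
        sid: observed_max_kw[sid] / metadata_kw[sid]
        for sid in metadata_kw
        if sid in observed_max_kw
    }


# Toy values (made up for illustration)
metadata_kw = {"a": 4.0, "b": 10.0}
observed_max_kw = {"a": 3.8, "b": 2.0}

ratios = capacity_ratios(observed_max_kw, metadata_kw)
# System "b" never gets near its stated capacity, so its normalised
# data would sit in a small part of the [0, 1] range.
```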

# 3. Hybrid 1
- use option 1
- adjust capacity if prediction data > 100%
- adjust capacity if prediction data < 50% (over a history of collecting it live)
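
The adjustment rule could look something like the sketch below. The issue does not say what the capacity should be adjusted to, so resetting it to the peak of the live history is an assumption, as are the function name and the default bounds.

```python
def adjust_capacity(capacity_kw, live_history_kw, lower=0.5, upper=1.0):
    """Return a possibly adjusted capacity for one system.

    capacity_kw: capacity from option 1 (training-set maximum).
    live_history_kw: power readings collected live for this system.
    Assumption: "adjust" means "reset to the live peak".
    """
    peak = max(live_history_kw)
    ratio = peak / capacity_kw
    if ratio > upper or ratio < lower:
        return peak
    return capacity_kw


raised = adjust_capacity(4.0, [1.0, 5.0])      # live peak above 100%
lowered = adjust_capacity(4.0, [0.5, 1.5])     # live peak below 50%
unchanged = adjust_capacity(4.0, [2.0, 3.0])   # within bounds
```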

# 4. Hybrid 2
- use option 2
- If training data < 50%, then use option 1
- If training data > 100%, then use option 1
- If prediction data > 100%, then use option 1
- If, over a week of data (which includes good sunny days), data < 50%, then use option 1
(These lower and upper bounds could change)
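
The rules above can be sketched as a single selection function. The names are hypothetical, and as a simplification the live check uses one value, the peak over at least a week of live data, for both the upper and the lower bound.

```python
def choose_capacity(metadata_kw, training_max_kw, live_peak_kw=None,
                    lower=0.5, upper=1.0):
    """Return (capacity_kw, source) for one PV system.

    Starts from the metadata capacity (option 2) and falls back to the
    training-set maximum (option 1) when observed peaks fall outside
    [lower, upper] as a fraction of the metadata capacity.
    live_peak_kw is assumed to be the peak over at least a week of
    live data that includes good sunny days.
    """
    def out_of_bounds(peak_kw):
        ratio = peak_kw / metadata_kw
        return ratio < lower or ratio > upper

    if out_of_bounds(training_max_kw):
        return training_max_kw, "training_max"
    if live_peak_kw is not None and out_of_bounds(live_peak_kw):
        return training_max_kw, "training_max"
    return metadata_kw, "metadata"


ok = choose_capacity(4.0, 3.8)                           # metadata looks fine
too_big = choose_capacity(10.0, 2.0)                     # training peak < 50%
live_high = choose_capacity(4.0, 3.8, live_peak_kw=4.5)  # live peak > 100%
```

Keeping `lower` and `upper` as parameters reflects the note that these bounds could change.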
