Ansible Operator playbook / role getting executed twice after initial operator installation, without any modification in resources. #6250
Comments
@nataraj24 Thanks for raising this issue! I did some brief digging on this and have a question: could you share the link to the docs that you followed? Aside from that, looking at the logs from the first run:
--------------------------- Ansible Task StdOut -------------------------------
TASK [start memcached] *********************************************************
task path: /opt/ansible/roles/memcached/tasks/main.yml:3
-------------------------------------------------------------------------------
{"level":"info","ts":1673948838.1551049,"logger":"proxy","msg":"Cache miss: apps/v1, Kind=Deployment, default/memcached-sample-memcached"}
{"level":"info","ts":1673948838.161831,"logger":"proxy","msg":"Injecting owner reference"}
{"level":"info","ts":1673948838.162155,"logger":"proxy","msg":"Watching child resource","kind":"apps/v1, Kind=Deployment","enqueue_kind":"cache.example.com/v1alpha1, Kind=Memcached"}
{"level":"info","ts":1673948838.1621983,"msg":"Starting EventSource","controller":"memcached-controller","source":"kind source: *unstructured.Unstructured"}
--------------------------- Ansible Task StdOut -------------------------------
TASK [memcached : Create Secret] ***********************************************
task path: /opt/ansible/roles/memcached/tasks/main.yml:33
-------------------------------------------------------------------------------
{"level":"info","ts":1673948838.4395416,"logger":"logging_event_handler","msg":"[playbook task start]","name":"memcached-sample","namespace":"default","gvk":"cache.example.com/v1alpha1, Kind=Memcached","event_type":"playbook_on_task_start","job":"7635955712780123854","EventData.Name":"memcached : Create Secret"}
{"level":"info","ts":1673948840.2461214,"logger":"proxy","msg":"Skipping, because gvk is blacklisted","GVK":"/v1, Kind=Secret"}
{"level":"info","ts":1673948840.2505968,"logger":"proxy","msg":"Injecting owner reference"}
{"level":"info","ts":1673948840.2508073,"logger":"proxy","msg":"Resource will not be watched/cached.","GVK":"/v1, Kind=Secret"}
{"level":"info","ts":1673948840.4567962,"logger":"logging_event_handler","msg":"[playbook task start]","name":"memcached-sample","namespace":"default","gvk":"cache.example.com/v1alpha1, Kind=Memcached","event_type":"playbook_on_task_start","job":"7635955712780123854","EventData.Name":"memcached : Create configmap"}
--------------------------- Ansible Task StdOut -------------------------------
TASK [memcached : Create configmap] ********************************************
task path: /opt/ansible/roles/memcached/tasks/main.yml:45
-------------------------------------------------------------------------------
{"level":"info","ts":1673948842.2737124,"logger":"proxy","msg":"Cache miss: /v1, Kind=ConfigMap, default/memcache-configmap"}
{"level":"info","ts":1673948842.2800245,"logger":"proxy","msg":"Injecting owner reference"}
{"level":"info","ts":1673948842.280217,"logger":"proxy","msg":"Watching child resource","kind":"/v1, Kind=ConfigMap","enqueue_kind":"cache.example.com/v1alpha1, Kind=Memcached"}
{"level":"info","ts":1673948842.2802489,"msg":"Starting EventSource","controller":"memcached-controller","source":"kind source: *unstructured.Unstructured"}
{"level":"info","ts":1673948842.9402254,"logger":"runner","msg":"Ansible-runner exited successfully","job":"7635955712780123854","name":"memcached-sample","namespace":"default"}
----- Ansible Task Status Event StdOut (cache.example.com/v1alpha1, Kind=Memcached, memcached-sample/default) -----
PLAY RECAP *********************************************************************
localhost : ok=3 changed=3 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
----------
It looks like it creates the child resources and establishes watches for these child resources. Looking at the second run logs:
--------------------------- Ansible Task StdOut -------------------------------
TASK [start memcached] *********************************************************
task path: /opt/ansible/roles/memcached/tasks/main.yml:3
-------------------------------------------------------------------------------
{"level":"info","ts":1673948851.1720097,"logger":"proxy","msg":"Read object from cache","resource":{"IsResourceRequest":true,"Path":"/apis/apps/v1/namespaces/default/deployments/memcached-sample-memcached","Verb":"get","APIPrefix":"apis","APIGroup":"apps","APIVersion":"v1","Namespace":"default","Resource":"deployments","Subresource":"","Name":"memcached-sample-memcached","Parts":["deployments","memcached-sample-memcached"]}}
{"level":"info","ts":1673948851.4520702,"logger":"logging_event_handler","msg":"[playbook task start]","name":"memcached-sample","namespace":"default","gvk":"cache.example.com/v1alpha1, Kind=Memcached","event_type":"playbook_on_task_start","job":"1728165814478689862","EventData.Name":"memcached : Create Secret"}
--------------------------- Ansible Task StdOut -------------------------------
TASK [memcached : Create Secret] ***********************************************
task path: /opt/ansible/roles/memcached/tasks/main.yml:33
-------------------------------------------------------------------------------
{"level":"info","ts":1673948853.2526982,"logger":"proxy","msg":"Skipping, because gvk is blacklisted","GVK":"/v1, Kind=Secret"}
--------------------------- Ansible Task StdOut -------------------------------
TASK [memcached : Create configmap] ********************************************
task path: /opt/ansible/roles/memcached/tasks/main.yml:45
-------------------------------------------------------------------------------
{"level":"info","ts":1673948853.4712036,"logger":"logging_event_handler","msg":"[playbook task start]","name":"memcached-sample","namespace":"default","gvk":"cache.example.com/v1alpha1, Kind=Memcached","event_type":"playbook_on_task_start","job":"1728165814478689862","EventData.Name":"memcached : Create configmap"}
{"level":"info","ts":1673948855.2703552,"logger":"proxy","msg":"Read object from cache","resource":{"IsResourceRequest":true,"Path":"/api/v1/namespaces/default/configmaps/memcache-configmap","Verb":"get","APIPrefix":"api","APIGroup":"","APIVersion":"v1","Namespace":"default","Resource":"configmaps","Subresource":"","Name":"memcache-configmap","Parts":["configmaps","memcache-configmap"]}}
{"level":"info","ts":1673948855.9475477,"logger":"runner","msg":"Ansible-runner exited successfully","job":"1728165814478689862","name":"memcached-sample","namespace":"default"}
----- Ansible Task Status Event StdOut (cache.example.com/v1alpha1, Kind=Memcached, memcached-sample/default) -----
PLAY RECAP *********************************************************************
localhost : ok=3 changed=0 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
----------
It looks like it notices that the child resources are already created and just continues along. This makes me think that the reconciliation loop was triggered again by some change to one of the child resources (potentially even a status update), and as such the Ansible playbook was run again. From my understanding, this is completely normal behavior due to the way that the watches are established.
Thank you @everettraven for the reply. Here is the link I followed for the memcached Ansible operator: https://sdk.operatorframework.io/docs/building-operators/ansible/quickstart/
If you look at the timestamps, the difference between the first playbook run's completion and the second run's initiation is only milliseconds, so there was no change to the objects/resources.
First playbook run completion: Tue Jan 17 15:17:22 IST 2023
Below are the full log statements:
{"level":"info","ts":1673948842.9402254,"logger":"runner","msg":"Ansible-runner exited successfully","job":"7635955712780123854","name":"memcached-sample","namespace":"default"}
----- Ansible Task Status Event StdOut (cache.example.com/v1alpha1, Kind=Memcached, memcached-sample/default) -----
PLAY RECAP *********************************************************************
{"level":"info","ts":1673948848.1544585,"logger":"logging_event_handler","msg":"[playbook task start]","name":"memcached-sample","namespace":"default","gvk":"cache.example.com/v1alpha1, Kind=Memcached","event_type":"playbook_on_task_start","job":"1728165814478689862","EventData.Name":"start memcached"}
So, I would say this is normal behavior. Especially when something is first installed, it's common for things to happen in different orders, so things like cache misses can occur that cause several reconciliations to fire in a row. I wouldn't really be worried about this unless it was stuck in a loop or causing performance issues. Is there a specific concern you have about this behavior?
Thank you @jberkhahn for the reply. The behavior is not consistent: in our application's Ansible operator the playbook runs three times. We are using the operator's blacklisting functionality for Secrets (a typical watches.yaml entry is sketched below). If a Secret is rotated after installation, these multiple runs overwrite the change. Logically, once the playbook has completed it should not be executed again.
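For context, the blacklist referred to here is configured per-resource in the operator's watches.yaml. A minimal sketch, assuming the quickstart's Memcached GVK and role layout (the role name is illustrative and may differ in the reporter's project):

```yaml
# watches.yaml (sketch): blacklisting the Secret GVK so changes to Secrets
# do not trigger reconciliation of the Memcached custom resource.
- version: v1alpha1
  group: cache.example.com
  kind: Memcached
  role: memcached          # illustrative; a playbook entry could be used instead
  blacklist:
    - group: ""
      version: v1
      kind: Secret
```

This matches the "Skipping, because gvk is blacklisted" proxy log lines above; note that blacklisting only suppresses watches on that GVK, it does not stop the playbook from recreating the resource on a later reconciliation.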
Any resolution to overcome the above problem would be helpful.
So to make sure I am following, the problem here is that Secret changes are being reverted despite it being blacklisted?
As far as I am aware, this isn't true in terms of how the ansible-operator works. It works by executing the playbook once for every reconciliation that is performed. The fact that this playbook is running more than a single time is not really an issue; as @jberkhahn mentioned, there are plenty of reasons that multiple reconciliations may occur after creating child resources, creating watches, etc.
Hi team, requesting your help on the above issues.
Hi team, kindly provide help on the issue.
/assign
@nataraj24 We spoke about this issue in the ansible operator community meeting and determined that there isn't an issue here.
During the community meeting we discussed that just because a resource is blacklisted doesn't mean the resource can't be recreated during a normal reconciliation; it only means that changes to that resource type won't trigger a reconciliation loop. This means your operator is likely recreating the Secret during a reconciliation triggered by some other change, thereby overwriting any changes you've previously made to that Secret. To fix this, you would need to implement logic that checks whether the Secret already exists and, if it does, skips overwriting it (see the sketch below).
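A minimal sketch of such a check, assuming the kubernetes.core collection and the ansible_operator_meta variables injected by the Ansible operator; the Secret name and data below are hypothetical:

```yaml
# Sketch: create the Secret only when it does not already exist, so that
# values rotated out-of-band are not overwritten on later reconciliations.
- name: Check whether the Secret already exists
  kubernetes.core.k8s_info:
    api_version: v1
    kind: Secret
    name: memcached-secret                            # hypothetical name
    namespace: "{{ ansible_operator_meta.namespace }}"
  register: existing_secret

- name: Create the Secret only if it is absent
  kubernetes.core.k8s:
    state: present
    definition:
      apiVersion: v1
      kind: Secret
      metadata:
        name: memcached-secret                        # hypothetical name
        namespace: "{{ ansible_operator_meta.namespace }}"
      stringData:
        password: "{{ lookup('password', '/dev/null length=16') }}"
  when: existing_secret.resources | length == 0
```

The `when` guard is what prevents the task from re-applying the Secret on every reconciliation.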
Again, I don't think this is an issue. The playbook is run any time reconciliation is triggered, and there are numerous reasons why a reconciliation may be triggered, resulting in multiple runs of the playbook. For the reasons mentioned above, and because multiple people have said in community calls that this is a non-issue, I am going to close this issue. If you encounter any other issues, please feel free to open a new issue, or reopen this issue if you feel it was closed in error.
Type of question
General operator-related help
Question
What did you do?
After installing the sample Memcached operator as explained in the operator-sdk documentation, the sample playbook/role is executed twice even after the playbook completes successfully, and even on the first install when there is no change to any of the resources.
What did you expect to see?
Playbook has to be executed only once.
What did you see instead? Under which circumstances?
Below is the status of the playbook executions after first-time installation. The full log is attached: memcached.log
kubectl logs deploy/memcached-operator-controller-manager -n memcached-operator-system -c manager -f | grep rescued
localhost : ok=3 changed=3 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
localhost : ok=3 changed=0 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
Environment
Operator type:
/language ansible
Kubernetes cluster type:
Native K8s cluster
$ operator-sdk version
operator-sdk version: "v1.25.0", commit: "3d4eb4b2de4b68519c8828f2289c2014979ccf2a", kubernetes version: "1.25.0", go version: "go1.19.2", GOOS: "linux", GOARCH: "amd64"
$ go version (if language is Go)
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"19", GitVersion:"v1.19.3", GitCommit:"1e11e4a2108024935ecfcb2912226cedeafd99df", GitTreeState:"clean", BuildDate:"2020-10-14T12:50:19Z", GoVersion:"go1.15.2", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"19", GitVersion:"v1.19.3", GitCommit:"1e11e4a2108024935ecfcb2912226cedeafd99df", GitTreeState:"clean", BuildDate:"2020-10-14T12:41:49Z", GoVersion:"go1.15.2", Compiler:"gc", Platform:"linux/amd64"}
Additional context