Skip to content

Conversation

@changlei-li
Copy link
Contributor

No description provided.


New API: `host.get_ntp_synchronized`, `host.set_server_localtime`

`host.get_ntp_synchronized` shows if the system time is synchronized with NTP
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be a implemented by a call to ntp_adjtime, and checking for TIME_ERROR, see https://man7.org/linux/man-pages/man2/adjtimex.2.html
(Note: do not check for TIME_OK, because there are a lot of leap-second specific return codes during normal operation, which are not errors).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another possibility is to use timedatectl show and look for NTPSynchronized


## Pool level set APIs

For a pool with large amount of hosts, provides a pool level set APIs which call
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about newly joined hosts in the pool? Might lead to inconsistent configuration if newly added hosts would have different configuration.
Might be useful to have a setting that, if the admin wants, allows newly joined hosts to synchronize its Host level NTP settings from the pool.

To make that easier then it'd be easier to have a struct with all the NTP settings that we can refer to. Unfortunately the IDL doesn't support structs, so the way to implement that would be to introduce a new class.

Then hosts could refer to that NTP setting class, and the pool could do the same: if the pool level NTP setting is empty then nothing happens, and if it is set then newly joined hosts would be set to that NTP class too.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For newly joined hosts, they keep their own configuration. It's allowed hosts have different configurations. I want to keep the pool level APIs simple enough. They are provided just for convenience from requirement level. There isn't a corresponding pool field either.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not have all hosts use the configuration of the master? I see no reason why hosts should differ with respect to NTP.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i.e. move all these fields to the Pool object instead.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the pool level APIs concern much, I would try to contact PM to descope it. Originally, it is not the main point.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe the name pool.set_xxx is misleading that it will manage the pool level config. Actually what we want is to provide an API to sync the hosts settings. How about rename them to pool.sync_xxx. Furthermore, can implement a general pool.sync_hosts xxx that can sync all host.xxx of hosts in the pool.


### Others

`host.enable/disable_ntp` starts or stops chronyd service.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Enabling NTP, or setting a new NTP server when there were no working ones before could require a large time jump.
IIRC NTP/chrony has some limits beyond which it is not feasible to slew/accelerate time to catch up and a jump would be required. How is that handled? (I think for chrony the setting would be makestep)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. The default chrony conf has maskestep item.

`host.get_ntp_synchronized` shows if the system time is synchronized with NTP
source.

`host.set_server_localtime` offers an API to set the local time when NTP disabled.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need a setting for leapsecmode? In some environments it may be preferable to avoid inserting leap seconds, and instead slowed down/sped up (slew) to accomodate.
This can avoid bugs in software that doesn't handle leap seconds well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is not in the scope.

`host.get_ntp_synchronized` shows if the system time is synchronized with NTP
source.

`host.set_server_localtime` offers an API to set the local time when NTP disabled.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also when NTP is enabled, attempting to set the time should raise an exception.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Exactly yes.

Copy link
Member

@psafont psafont left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the API surface could be minimized further to avoid footguns. We need to be careful as well with the expectations of setting time pool-wide, this can't be simply implemented naively as it can be very problematic.

I would also ask you to consider dropping the mode that NTP is not used at all: this means that server time in a pool can drift considerably. I would recommend instead a mode where the server run NTP servers and synchronize between themselves. This means that they might not be accurate, but at least they will be synchronized.

The call to set time and get time need changes as well, depending on the previous decision, it might make more sense to use pool-wide calls for these.

- User can get the current time zone of the host via XenAPI
- User can add/remove custom NTP servers via XenAPI
- User can set custom NTP servers via XenAPI
- User can get the current custom NTP servers via XenAPI
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I were a user, I would like to know the ntp servers even if I use the default mode, or the DHCP.

I think there should be a read-only field that shows which NTP servers is the host using. I would also add a RPC to set the custom servers: Host.set_ntp_custom_servers

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

User can use host.get_ntp_servers_status to know the current ntp servers and their status.
There is Host.set_ntp_custom_servers in the design doc for user to set in custom mode.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

User can use host.get_ntp_servers_status to know the current ntp servers and their status.

Can't this information be displayed in the read-only field? I'd rather reduce the API surface and make it simpler to use than adding functions if it can be avoided

`host.get_ntp_synchronized` shows if the system time is synchronized with NTP
source.

`host.set_server_localtime` offers an API to set the local time when NTP disabled.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it accepts a localtime without timezone

As I said in the other PR, times without a timezone MUST NOT be accepted, under any cirmcumstance. This is because clients can feed timestamps anchored to any timezone, which is ambiguous and leads to mistake.

The current RPC that returns a timezonless timestamp is a legacy call. If it's needed for this work, it should be repurposed (I don't know of any clients that rely on the current behaviour), or a new call should be created that always return the time difference to UTC

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand your point about times without a timezone MUST NOT be accepted. For existing timedate type filed in XAPI DB, like host.last_software_update, pool.last_update_sync, times without a timezone will goes to a mess.
But in this feature, senario is different. The timezone APIs are added, host.set_server_localtime set time without timezone is intuitive. And also used in date -s, timedatectl set-time.
In fact, we cannot keep dom0 timezone in UTC. Both host-installer, xsconsole provides timezone setting UI.
The legacy host.get_server_localtime can be considered to add timezone info, like other commands date, timedatectl do.
From the implementation aspect, we need discuss if DataTime type can be reused. Because it always adds UTC for the time without timezone.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

host.set_server_localtime set time without timezone is intuitive.

I think we need to separate two different matters:

  • How API clients use the RPC (1)
  • How users use those API clients (2)

For (1) it's of the essence that there is no ambiguity, this means forcing a timezoned timestamp.
For (2), the clients can make an effort to provide users an intuitive interface. Xapi should support making it possible for clients to provide this intuitive interface for humans, but does not have to provide it iself.

For example, xe:

  • user inputs xe host-set-time 10:05
  • xe recognises the time lacks seconds and timezone, assumes 0 seconds.
  • xe fetches the current timezone for the server (+5:00) using an API (maybe host.get_server_localtime, which always returns timezone information along with the timestamp)
  • xe send the request host.set-time with the requested timestamp and timezone information.

I do think that DateTime can be used for this, I would like to remove the handling of timezoneless timestamps from it in the future. For 9.0 we can remove all producers of it, and in 10.0 the processing can be removed altogether, since we don't expect those future hosts to interact with 8.0 ones.


## Pool level set APIs

For a pool with large amount of hosts, provides a pool level set APIs which call
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For a pool with large amount of hosts, provides a pool level set APIs which call

I recommend against this: it's very difficult to keep time between server synchronized, and it's very easy to get this call wrong. For example, making the calls in a serial way (which is the default in message_forwarding) can lead to some hosts being some seconds ahead of others in large pools. This is best left to customers that decide to go on this arduous path, or spend a considerable amount of time and effort to get this correctly. Seeing that this is only for convenience, I don't think we can provide enough assurances to get this call right. It's best to drop it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like https://github.com/xapi-project/xen-api/pull/6745/files#r2499366088, storing the configuration in a single place in the Pool object. That has no issues with larger pools then (on startup each member reads the config and applies it, if the coordinator is not reachable, then it'll keep using whatever chrony.conf it already had).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK. If the pool level APIs concern much, I would try to contact PM to descope it. Originally, it is not the main point.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By the way, set_server_localtime is excluded for pool level APIs, which other set APIs are applied.

Comment on lines +42 to +44
`host.ntp_mode`, enum host_ntp_mode(ntp_mode_dhcp, ntp_mode_custom, ntp_mode_default)
`host.ntp_custom_servers`, string set
`host.ntp_enabled`, bool
Copy link
Member

@psafont psafont Nov 6, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think merging ntp_enabled and ntp_mode would simplify the control of the feature:

let ntp_mode = DHCP | Manual | Factory | Disabled

To enable ntp mode, users can call set_ntp_mode DHCP; set_ntp_mode Factory; or set_ntp_servers [ntp1.server.com; ntp2.server.com]. And to disable call set_ntp_mode Disabled

And to check whether NTP is enabled, and which mode it is, just call get_ntp_mode

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should be Manual of string list, but we can't express that in the IDL.

- User can get status of current NTP servers via XenAPI
- User can get NTP sync status via XenAPI
- User can set the items above from pool level API
- User can set host’s local time when NTP is disabled via XenAPI
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd suggest splitting the implementation into 2 phases: add the query capabilities first: this would unblock UI development, and provide useful diagnostic capabilities to existing users: host.get_ntp_servers_status, host.get_ntp_synchronized.
And then implement the other _set*/get* APIs.

Also they should all be fields in the DB, with getters automatically generated, otherwise API clients like XenCenter won't be able to use these efficiently (setters could be overriden as custom code that both sets the DB and takes some action on the host).

@changlei-li changlei-li force-pushed the private/changleli/add-ntp-doc branch from 6c68616 to 2bf0903 Compare November 7, 2025 02:29
@changlei-li
Copy link
Contributor Author

changlei-li commented Nov 7, 2025

I think the API surface could be minimized further to avoid footguns. We need to be careful as well with the expectations of setting time pool-wide, this can't be simply implemented naively as it can be very problematic.

pool-level set APIs don't contain time setting. Sorry I didn't show it explicitly in chapter Pool level set APIs but mentioned in use cases.

I would also ask you to consider dropping the mode that NTP is not used at all: this means that server time in a pool can drift considerably. I would recommend instead a mode where the server run NTP servers and synchronize between themselves. This means that they might not be accurate, but at least they will be synchronized.

"Dropping the mode that NTP is not used at all" You mean not to provide a disable-ntp command but a internal ntp server in the pool. That's a big change.

The call to set time and get time need changes as well, depending on the previous decision, it might make more sense to use pool-wide calls for these.

Your suggestion makes sense. By the internal NTP sever, we can set the server time, then the pool time will be synced. It's good to add mode like ntp_mode_internal in the future.
But not providing the ntp disable is a little arbitrary? If we keep the disable-ntp option, we need the local-time settings.

@psafont
Copy link
Member

psafont commented Nov 7, 2025

"Dropping the mode that NTP is not used at all" You mean not to provide a disable-ntp command but a internal ntp server in the pool. That's a big change.

yes, it's a big change, but it's important to keep it in mind if customers start seeing strange issues when using the manual mode, we might have to provide it as a more robust solution, and deprecate manual mode in the future.

Your suggestion makes sense. By the internal NTP sever, we can set the server time, then the pool time will be synced. It's good to add mode like ntp_mode_internal in the future.
But not providing the ntp disable is a little arbitrary? If we keep the disable-ntp option, we need the local-time settings.

Well, if there's an internal NTP server, the API pool.set_time can be provided, and the disable ntp can be disregarded. I do agree that's for the future, I doubt having such a feature can be done in time for 9.0

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants