Skip to content

Proposal: User <-> repo unit cross table for easier permission checking. #9613

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
guillep2k opened this issue Jan 5, 2020 · 4 comments
Open
Labels
type/proposal The new feature has not been accepted yet but needs to be discussed first.

Comments

@guillep2k
Copy link
Member

I propose to create a new table (user_repo_units) to summarize all permissions a user requires to access any repository; this table should be derived from all the possible permission sources (e.g. team membership, admin status, repo visibility, repo active units, etc.). Any changes in the system that modifies the user's accessibility to any repo should update said table. Then, many queries can use the table for user access checking instead of a complex set of golang-side code.

The table would have a structure like:

type UserRepoUnits struct {
	UserID      int64       `xorm:"pk"`
	RepoID      int64       `xorm:"pk INDEX"`
	Type        UnitType
	Mode        AccessMode
}

Units accessible for all users (e.g. UnitTypeCode on public repositories) will not have a record for each user but one for the user 0 instead (may be -1 is better?). This should reduce the number of records in the table considerably.

Advantages

  • Access checks would be centralized in a few functions, much easier to maintain.
  • Most queries can be simplified considerably, especially for the home page.
  • It will improve the performance of the more frequent actions (e.g. home page, issue list, etc), rather than the infrequent permission modifications.
  • Bulk processes (like notifications) can retrieve a valid list of users with a single query instead of multiple complex steps.
  • Simplified access checks means less likely to leak information inadvertently.

Usage

The table can then be used in queries. For example, to get all open issues (UnitTypeIssues = 2) the user 1234 can see (AccessModeRead = 1), we could do:

select issue.*
from issue
where (select max(mode) from user_repo_units
        where user_repo_units.repository_id = issue.repository_id
          and user_repo_units.user_id in (1234, 0)
          and user_repo_units.type = 2) >= 1;
  and issue.is_closed = false;

No team membership check required, no ownership or admin check required. All in the same (pretty much standardized) query.

Alternatives

  • We could use a column for each unit type instead of separate records. This would reduce the number of records (and speed up the queries even more) at a little additional cost on maintainability.
@lunny lunny added the type/proposal The new feature has not been accepted yet but needs to be discussed first. label Jan 6, 2020
@lunny
Copy link
Member

lunny commented Jan 6, 2020

And we could also support collabrators unit permissions if this. Currently only organization's repositories could have unit permissions, but not individual's.

And this may result in the inconsistent between team_unit and user_repo_units. But even that, I also support to add the extra table to do that. We have to be careful of the consistent of database.

@sapk
Copy link
Member

sapk commented Jan 6, 2020

Couldn't we define a view wich will be optimize by the SQL engine instead of a table that need to be tightly kept uptodate ?

@guillep2k
Copy link
Member Author

guillep2k commented Jan 7, 2020

@sapk That would be a very complex view and will not likely be much optimizable. Views only translate as subqueries; they don't have any kind of pre-processing.

There's however the concept of a materialized view (a.k.a. indexed view), which is a read-only table automatically maintained by the rdbms based on a view definition. Those would be very convenient here, but I don't think they're available on all of our supported databases (not in sqlite3) and they have some important limitations and caveats as well, like not being immediately updated (e.g. in PostgreSQL a REFRESH MATERIALIZED VIEW mymatview; statement must be ran in order to update its contents).

This SO answer explains how views can improve a query performance, but only if they are indexed (materialized) views.

@sapk
Copy link
Member

sapk commented Jan 7, 2020

@guillep2k The optimization (cache) could be done on gitea part like using xorm cache over the view and clearing it when rights are changed. I am not against the table, I just fear that it would become a hazardous code to maintain so if it could be a more simple solution it would be better. In fact, we can go first with the table and later move to a view and just remove the code that updated the table.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type/proposal The new feature has not been accepted yet but needs to be discussed first.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants