Respect project visibility settings in the contributions calendar
This MR fixes a number of bugs relating to access controls and date selection of events for the contributions calendar
Closes https://gitlab.com/gitlab-org/gitlab-ce/issues/23403
See merge request !2019
Signed-off-by: Rémy Coutable <remy@rymai.me>
At the moment we cannot see whether a user left a project because their
membership expired or because they opted to leave the project themselves.
This adds a new event type that allows us to make this differentiation.
Note that it is not really feasible to go back and reliably fix up the
previous events. As a result, events for previous expiry-based removals
will remain unchanged, but events of this nature will be represented
correctly going forward.
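As a rough sketch of what this enables (the numeric action values below
are hypothetical, not the actual constants defined on the Event model),
the two kinds of removals can then be told apart directly in a query:
-- Illustrative action values only; the real constants live in the Event model.
SELECT action, COUNT(*) AS total
FROM events
WHERE action IN (9 /* user left voluntarily */, 11 /* membership expired */)
GROUP BY action;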
Per GitLab.com's performance metrics this method could take up to 5
seconds of wall time to complete, while only taking 1-2 milliseconds of
CPU time. Removing the Redis lease in favour of conditional updates
allows us to work around this.
A slight drawback is that this allows multiple threads/processes to try
to update the same row. However, only a single thread/process will
ever win since the UPDATE query uses a WHERE condition to only update
rows that were not updated in the last hour.
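In essence this boils down to a conditional UPDATE along these lines (a
sketch only: the table/column names and the exact interval are
assumptions based on the description above). Whichever thread/process
runs it first wins; the others simply match zero rows:
-- Sketch: only touch the row if it was not updated in the last hour.
UPDATE projects
SET last_activity_at = NOW()
WHERE projects.id = 13083
AND (last_activity_at IS NULL
OR last_activity_at < NOW() - INTERVAL '1 hour');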
Fixes gitlab-org/gitlab-ce#22473
In 8278b763d9 the default behaviour of annotation
changed, which was causing a lot of noise in diffs. We decided in #17382
that it is better to get rid of the whole annotate gem, and instead let people
look at schema.rb for the columns in a table.
Fixes: #17382
By simply loading the first event from the already sorted set we save
ourselves extra (slow) queries just to get the latest update timestamp.
This removes the need for Event.latest_update_time and significantly
reduces the time needed to build an Atom feed.
Fixes gitlab-org/gitlab-ce#12415
Instead of using MAX(events.updated_at) we can simply sort the events in
descending order by the "id" column and grab the first row. In other
words, instead of this:
SELECT max(events.updated_at) AS max_id
FROM events
LEFT OUTER JOIN projects ON projects.id = events.project_id
LEFT OUTER JOIN namespaces ON namespaces.id = projects.namespace_id
WHERE events.author_id IS NOT NULL
AND events.project_id IN (13083);
we can use this:
SELECT events.updated_at AS max_id
FROM events
LEFT OUTER JOIN projects ON projects.id = events.project_id
LEFT OUTER JOIN namespaces ON namespaces.id = projects.namespace_id
WHERE events.author_id IS NOT NULL
AND events.project_id IN (13083)
ORDER BY events.id DESC
LIMIT 1;
This has the benefit that on PostgreSQL a backwards index scan can be
used, which, due to the "LIMIT 1", will process at most a single row.
This in turn greatly speeds up the process of grabbing the latest update
time. This can be confirmed by looking at the query plans. The first
query produces the following plan:
Aggregate (cost=43779.84..43779.85 rows=1 width=12) (actual time=2142.462..2142.462 rows=1 loops=1)
-> Index Scan using index_events_on_project_id on events (cost=0.43..43704.69 rows=30060 width=12) (actual time=0.033..2138.086 rows=32769 loops=1)
Index Cond: (project_id = 13083)
Filter: (author_id IS NOT NULL)
Planning time: 1.248 ms
Execution time: 2142.548 ms
The second query in turn produces the following plan:
Limit (cost=0.43..41.65 rows=1 width=16) (actual time=1.394..1.394 rows=1 loops=1)
-> Index Scan Backward using events_pkey on events (cost=0.43..1238907.96 rows=30060 width=16) (actual time=1.394..1.394 rows=1 loops=1)
Filter: ((author_id IS NOT NULL) AND (project_id = 13083))
Rows Removed by Filter: 2104
Planning time: 0.166 ms
Execution time: 1.408 ms
According to the above plans the second query is around 1500 times
faster. However, re-running the first query produces timings of around
80 ms, making the second query "only" around 55 times faster.