The meta of entries that have been redirected by the backend may already
be overwritten by BPU while they remain in the resolve queue. In this
case, FTQ sends the wrong meta to BPU.
Now it's marked on the last instruction in the current page, so
ICache/Ifu will not fetch the next page.
Before this change, CI could pass because Ifu has fixed instr range on
fallthrough (!taken):
1bb82e92fd/src/main/scala/xiangshan/frontend/ifu/Ifu.scala (L265-L278)
After this change, maybe we can remove `s1_ftrRange` and simply rely
on `s1_takenCfiOffset` to decide the range. CC @my-mayfly
- `prevInstrCount` is equal to next cycle's `numFromFetch`, which is
calculated at ifu.s3,
- compare `prevInstrCount` with next cycle's number of invalid entries
(i.e. `numInvalidNext`),
- the result is next cycle's ready (NOT considering dequeue behavior and
predChecker).
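
The three steps above can be sketched as a small behavioral model. This is not the Chisel code: it assumes the comparison is "incoming count fits into the free entries" (`<=`), and the 48-entry buffer size in the example is purely hypothetical.

```python
def next_cycle_ready(prev_instr_count: int, num_invalid_next: int) -> bool:
    """Model of next cycle's ready, ignoring dequeue and predChecker.

    prev_instr_count stands in for next cycle's numFromFetch;
    num_invalid_next is the number of free entries next cycle.
    Assumption: ready means every incoming instruction has a free entry.
    """
    return prev_instr_count <= num_invalid_next


# Hypothetical example: 48-entry buffer with 40 entries valid next cycle.
size, num_valid_next = 48, 40
num_invalid = size - num_valid_next  # 8 free entries

print(next_cycle_ready(8, num_invalid))  # fits exactly
print(next_cycle_ready(9, num_invalid))  # one instruction too many
```

Because dequeue is ignored, the model is conservative: entries freed in the same cycle do not count toward ready.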
In v3 Ftq, we have `redirect` to Bpu and `redirectNext` to prefetch,
therefore bp1=pf0. In this case, when an s1 prediction is overridden by
s3, it can have reached at most pf2/if1 (i.e. the prefetchPipe `s2` stage
reg, wayLookup `entries(writePtr - 1)`, or the mainPipe `s1` stage reg).
In the old design we only flush pf1, which could be wrong.
That said, we don't have to flush prefetchPipe s2: it's only a prefetch
and has no impact on control flow. (If we don't flush it, it can be seen
as some sort of wrong-path prefetch.)
So, in this PR we implement wayLookup & mainPipe s1 flush from Bpu s3
override.
NOTE: To reduce implementation cost, wayLookup assumes that we can flush
at most 1 entry at the tail. This could be wrong if we add Bpu s4/s5 (or
even later) flushes in the future. See comments there.
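
To illustrate the "flush at most 1 entry at tail" assumption, here is a minimal behavioral sketch of a circular FIFO with a tail flush. The class and method names (`WayLookupModel`, `flush_tail`) are hypothetical and the model is not the actual wayLookup implementation; in particular, it leaves it to the caller to decide whether the tail entry really belongs to the overridden prediction.

```python
class WayLookupModel:
    """Behavioral circular FIFO; a sketch, not the Chisel module."""

    def __init__(self, size: int) -> None:
        self.size = size
        self.entries = [None] * size
        self.read_ptr = 0
        self.write_ptr = 0
        self.count = 0

    def write(self, entry) -> bool:
        """Enqueue at the tail; returns False when full."""
        if self.count == self.size:
            return False
        self.entries[self.write_ptr] = entry
        self.write_ptr = (self.write_ptr + 1) % self.size
        self.count += 1
        return True

    def read(self):
        """Dequeue from the head; returns None when empty."""
        if self.count == 0:
            return None
        entry = self.entries[self.read_ptr]
        self.read_ptr = (self.read_ptr + 1) % self.size
        self.count -= 1
        return entry

    def flush_tail(self) -> bool:
        """Drop entries(writePtr - 1), i.e. only the most recent write."""
        if self.count == 0:
            return False
        self.write_ptr = (self.write_ptr - 1) % self.size
        self.entries[self.write_ptr] = None
        self.count -= 1
        return True


q = WayLookupModel(4)
q.write("blk_a")
q.write("blk_b_wrong_path")  # written from an s1 prediction
q.flush_tail()               # s3 override drops only this tail entry
q.write("blk_b")
print(q.read(), q.read())
```

A single `flush_tail` call can undo only one write, which is why a later-stage override (s4/s5) that lets more than one wrong-path entry in would break this scheme.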
When some align bank is located on the next page, BPU may generate a
cross-page fetch block.
For example, given [alignBank0, alignBank1], if alignBank1 is located on
the next page, alignBank0 has no taken branch, and alignBank1 has a taken
branch, the generated fetch block will be a cross-page fetch block.
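
The example can be checked with a small address calculation. This is only a sketch: the 4 KiB page size and the 32-byte align bank width are assumptions chosen for illustration, and `is_cross_page` is a hypothetical helper, not a function from the code base.

```python
PAGE_BYTES = 4096       # assumed page size (4 KiB)
ALIGN_BANK_BYTES = 32   # hypothetical align bank width


def page_of(addr: int) -> int:
    """Page number of a byte address."""
    return addr // PAGE_BYTES


def is_cross_page(start_addr: int, taken_bank: int) -> bool:
    """True if the fetch block spans two pages.

    The block is modeled as running from alignBank0 (the bank that
    contains start_addr) up to and including the bank that holds the
    taken branch.
    """
    bank0_base = start_addr - start_addr % ALIGN_BANK_BYTES
    last_bank_base = bank0_base + taken_bank * ALIGN_BANK_BYTES
    return page_of(bank0_base) != page_of(last_bank_base)


# alignBank0 = 0x0FE0..0x0FFF (last bank of page 0), alignBank1 = 0x1000..
print(is_cross_page(0x0FE0, taken_bank=1))  # taken branch in alignBank1
print(is_cross_page(0x0FE0, taken_bank=0))  # taken branch in alignBank0
```

With the taken branch in alignBank0 the block ends before the page boundary, so only the first case is cross-page.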
TAGE should only be responsible for predicting the direction of
conditional branches; therefore, it now only outputs condTakenMask
instead of the takenMask for all branches.
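
As a rough illustration of the split, the sketch below shows how a consumer might rebuild a full taken mask from `condTakenMask`. Everything except `condTakenMask` is an assumption: the branch-type masks `is_cond_mask` and `is_uncond_mask` and the function `final_taken_mask` are hypothetical names, and the real combination logic may differ.

```python
def final_taken_mask(cond_taken_mask, is_cond_mask, is_uncond_mask):
    """Combine TAGE's direction prediction with branch-type information.

    - conditional branch: taken iff TAGE predicts taken
    - unconditional branch: always taken, TAGE output is ignored
    """
    return [
        (is_cond and cond_taken) or is_uncond
        for cond_taken, is_cond, is_uncond in zip(
            cond_taken_mask, is_cond_mask, is_uncond_mask
        )
    ]


# Three hypothetical branch slots: cond (taken), cond (not taken), jump.
cond_taken = [True, False, False]
is_cond = [True, True, False]
is_uncond = [False, False, True]
print(final_taken_mask(cond_taken, is_cond, is_uncond))
```

Keeping the unconditional case outside TAGE means its output is meaningful only for slots that actually hold a conditional branch.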
The current WriteBuffer is only suitable for writing one way at a time,
but the tag table needs to write multiple ways at once. Therefore, we
are temporarily using a Queue to buffer the write requests.
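
A minimal behavioral sketch of the idea follows: each queued request carries a way mask, so several ways of one set can be written by a single request. The names (`MultiWayWriteQueue`, `drain_one`) and the table layout are hypothetical, and this is not the Chisel `Queue` used in the change.

```python
from collections import deque


class MultiWayWriteQueue:
    """Bounded FIFO of (set index, way mask, per-way tags) requests."""

    def __init__(self, depth: int) -> None:
        self.depth = depth
        self.queue = deque()

    def enqueue(self, set_idx: int, way_mask, tags) -> bool:
        """Buffer one multi-way write; returns False when full."""
        if len(self.queue) >= self.depth:
            return False
        self.queue.append((set_idx, tuple(way_mask), tuple(tags)))
        return True

    def drain_one(self, table) -> bool:
        """Apply the oldest request to the table, all masked ways at once."""
        if not self.queue:
            return False
        set_idx, way_mask, tags = self.queue.popleft()
        for way, (enable, tag) in enumerate(zip(way_mask, tags)):
            if enable:
                table[set_idx][way] = tag
        return True


# Hypothetical 2-set, 4-way tag table.
tag_table = [[None] * 4 for _ in range(2)]
wq = MultiWayWriteQueue(depth=2)
wq.enqueue(1, [True, False, True, False], [0xA, 0, 0xB, 0])
wq.drain_one(tag_table)
print(tag_table[1])
```

A buffer that accepts one way per entry would need two entries for this request, which is the limitation the commit works around.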
Hopefully this will fix the CI failure caused by Upload Artifact running
out of memory (OOM) on the GitHub-hosted runner, which has only 16G mem:
<img width="858" height="88" alt="image"
src="https://github.com/user-attachments/assets/1e78ce7f-49e5-4b1c-8f9d-41d74fdf52a9"
/>
Local analysis with jvisualvm shows we need 8.8G heap mem at peak
(JVM_XMX=16G with the default gc strategy), so maybe we can reduce
JVM_XMX from 16G to 10G to trigger gc more frequently and use less mem.
Also, we can clean up unused pre-installed tools to reclaim more space
for swap (refer to
https://github.com/actions/runner-images/issues/709#issuecomment-612569242),
though we're utilizing `/mnt` now and it seems the disk space is already
enough. Then we can increase the swapfile size.
Also: fix typo