Previously, when an access touched MMIO space after splitting, we
generated the exception in the MisalignBuffer, which prevented the
exception address from being written to the ExceptionBuffer.
Therefore, we now generate the exception in the pipeline so that the
exception address can be written correctly.
Misaligned loads that cause a TLB miss are no longer allowed to enter
the LoadMisalignBuffer, because they would make subsequent exception
addresses incorrect:
the original address of the misaligned load is 0x07, while the first
request address after splitting is 0x00.
Thus, when a page fault occurs, 0x00 would be reported as the
exception address instead of the original 0x07.
---
Stores have always been handled this way and do not require modification.
**Bug Trigger:** In a self-modifying program, the program modifies its
own instructions in a region where PBMT=NC and PMA=MM. If difftest is
skipped in this case, NEMU will not execute the corresponding memory
access instruction. This causes NEMU and DUT to execute different
instructions later on, ultimately leading to an error.
**Solution:** For regions where PBMT=NC and PMA=MM, difftest should not
be skipped, since PMA=MM indicates that NEMU can perform normal
synchronization. However, for regions with PMA=IO, difftest should still
be skipped because NEMU might not be able to access the corresponding
devices. Instruction self-modification in PMA=IO regions is generally
not a concern, as such regions are typically non-writable. Therefore,
synchronization of self-modifying IO instructions is not handled here
(as doing so would be overly complex).
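A minimal sketch of the resulting skip rule, assuming illustrative
attribute bits `isNC` (PBMT=NC), `isMM` (PMA=MM), and `isIO` (PMA=IO)
for the access; these are not the real signal names:

```scala
// Hedged sketch (inside the difftest glue): skip difftest only for PMA=IO,
// since NEMU may not model the device behind it; a PBMT=NC access backed by
// main memory (PMA=MM) stays visible so NEMU can synchronize normally.
val skipDifftest = isIO
val syncDifftest = isNC && isMM // explicitly not skipped
```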
* MMIO or NC accesses should report a `Hardware Error` when the response carries `nderr`
* the LoadUnit should report a `Hardware Error` when the load is to be
`delay kill`ed by fast replay
The prior design reassigned `io.lsq.ldin.bits.rep_info.need_rep` to 0
when the source came from the MisalignBuffer, preventing cancellation
of rar/raw enqueue requests during misaligned-instruction reissue.
Thus, we must use `io.misalign_ldout.bits.rep_info.need_rep` to
determine whether to revoke rar/raw enqueue requests when the source is
the MisalignBuffer.
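A minimal sketch of the intended selection, assuming a flag
`s3_frm_mis_buf` that marks a MisalignBuffer reissue:

```scala
// Pick the replay flag by source; only the misalign_ldout copy survives the
// need_rep reassignment described above.
val s3_need_rep = Mux(s3_frm_mis_buf,
  io.misalign_ldout.bits.rep_info.need_rep,
  io.lsq.ldin.bits.rep_info.need_rep)
val s3_revoke_rar_raw = s3_valid && s3_need_rep // cancel rar/raw enqueue
```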
For misaligned accesses, if an access after the split falls into `nc`
space, a misaligned exception should also be generated.
Co-authored-by: Yanqin Li <maxpicca@qq.com>
PerfCCT (performance counter commit trace) is an instruction-level
granularity perf counter, similar to GEM5's.
How to use it:
1. Build with the "WITH_CHISELDB=1" argument
2. Run with "--dump-db --dump-select-db lifetime" to get the database
3. To visualize instruction lifetimes, run "python3 scripts/perfcct.py
"the-db-file-path" -p 1 -v | less"
4. The analysis script currently lives in the XS-GEM5 repo, see
https://github.com/OpenXiangShan/GEM5/blob/xs-dev/util/ClockAnalysis.py
How it works:
1. Allocate one unique tag, "seqNum" (as in GEM5), for each instruction
at the fetch stage
2. Pass the "seqNum" through each pipeline stage
3. Record perf data through the DPI-C interface
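A minimal sketch of step 1, assuming a free-running 64-bit counter and
a hypothetical `fetchFire` pulse per fetched instruction:

```scala
// Allocate a unique, monotonically increasing seqNum at fetch; the tag is
// then carried in the uop bundle through every stage and sampled via DPI-C.
val seqNum = RegInit(0.U(64.W))
when (fetchFire) { seqNum := seqNum + 1.U }
```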
As the comment says, even if a `PF` is generated, an address is still
produced for `PMP/PMA` checking, which can lead to some strange
responses.
Since a previous modification
(https://github.com/OpenXiangShan/XiangShan/pull/4426) removed
`s2_exception`, `s2_uncache` was generated incorrectly.
This is now expressed with clearer semantics:
`s2_actually_uncache`: the real physical address is in uncache space.
`s2_uncache` has been retained to distinguish whether the request comes
from prefetching, which may be handled in a subsequent change by **YQ
senior sister**.
I synchronised the StoreUnit changes in this PR
(https://github.com/OpenXiangShan/XiangShan/pull/4441).
A loadAddrMisaligned exception is generated when a misaligned access
touches uncache space.
---
A misaligned load sets the loadAddrMisaligned exception flag at s0 to
ensure that it only enters the LoadMisalignBuffer and has no other side
effects.
However, this prevents `s2_uncache` from being generated properly.
Previously we used an additional `s2_un_misalign_exception` to flag
this.
Now, after examining the semantics of `s2_uncache`, those semantics can
be represented appropriately by directly removing the exception-related
signals.
Since a load that crosses a 16-byte boundary must be split and accessed
twice, it enters the `RAR Queue` twice but occupies only one `virtual
load queue` entry. In the extreme case, 36 loads that span 16 bytes can
therefore fill all 72 `RAR Queue` entries.
---
There was a problem with our previous handling: if the oldest load
instruction spanning 16 bytes enters the `replayqueue` while an
instruction in the `loadmisalignbuffer` cannot finish executing because
the `RAR Queue` is full, then the oldest load can never be issued,
because the `loadmisalignbuffer` always has an instruction in it.
---
Therefore, we use a more brute-force scheme:
when the RAR Queue is full, the misaligned load generates a rollback,
and the next load instruction the loadmisalignbuffer may accept must be
the oldest one (if it is misaligned).
* To meet timing, the RAR enqueue conditions have to be relaxed; the
worst timing paths come from `pmp` and `missQueue`. The gating is
sketched below.
* If `LoadQueueRARSize` == `VirtualLoadQueueSize`, we only need to
exclude prefetches.
* If `LoadQueueRARSize` < `VirtualLoadQueueSize`, the `s2_can_query`
situation must also be considered.
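A minimal sketch of that gating, assuming `s2_isPrefetch` and
`s2_can_query` are the signals named above:

```scala
// Elaboration-time choice: with a full-size RAR queue only prefetches are
// excluded; with a smaller queue, s2_can_query must also hold.
val s2_rar_can_enq =
  if (LoadQueueRARSize == VirtualLoadQueueSize) !s2_isPrefetch
  else !s2_isPrefetch && s2_can_query
```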
`s0_src_valid_vec` is not `s0_src_select_vec`: a bit of
`s0_src_valid_vec` is set when the corresponding input is `valid`.
Therefore, `misalign wakeup` needs to gate `s0_valid` globally.
* Because `LoadQueueRARSize == VirtualLoadQueueSize`, no additional
logic is needed for RAR enqueue
* When no fast replay is needed, the LoadUnit allocates a RAW entry
When `io.dcache.req.ready` is false, a misaligned load stalls, but
`wakeup` still fires normally and is not canceled in `s3`, which causes
the backend to receive wrong data.
`prefetch.w` sends a write request to `TLB/PMA/PMP`.
As a result, `PMA/PMP` returns a permission check (`io.pmp.st`) for the
write request.
---
Previously, we only handled the case where `prefetch.r` lacked read
permission, not the case where `prefetch.w` lacked write permission.
**So, when `prefetch.w` targets an address without write permission,
the request is still sent to the DCache, which generates an error.**
**This PR fixes that: when `PMA/PMP` asserts `io.pmp.st`, we generate
`dcache.s2_kill`.**
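A minimal sketch of the fix, assuming a flag `s2_isPrefWrite` that
distinguishes `prefetch.w` from `prefetch.r`:

```scala
// Kill the DCache access when PMP/PMA denies the prefetch: st for prefetch.w,
// ld for prefetch.r (io.pmp.* as named above; other kill terms omitted).
val s2_pmp_deny = Mux(s2_isPrefWrite, io.pmp.st, io.pmp.ld)
io.dcache.s2_kill := s2_pmp_deny
```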
# Background
## Problem
How to design a more efficient entry rule for a new load/store request
when a load/store with the same address already exists in the `ubuffer`?
* **Old Design**: Always **reject** the new request.
* **New Design**: Consider **merging** requests.
## Merge Scenarios
‼️ A new request can be merged into an existing one only if both are
`NC`.
1. **New Store Request:**
   1. **Existing Store:** Merge (the new store is younger).
   2. **Existing Load:** Reject.
2. **New Load Request:**
   1. **Existing Load:** Merge (the new load may be younger or older;
both are OK to merge).
   2. **Existing Store:** Reject.
# What does this PR do?
## 1. Entry Actions
1. **Allocate** a new entry and mark it `valid`:
   1. When there is no matching address.
2. **Allocate** a new entry and mark it `valid` and `waitSame`:
   1. When there is a matching address, and:
      * the virtual addresses and attributes are the same, and
      * the older entry is either selected to issue or already issued.
3. **Merge** into an existing entry:
   1. When there is a matching address, and:
      * the virtual addresses and attributes are the same, and
      * the older entry is **not** selected to issue or issued.
4. **Reject** the new request:
   1. When the ubuffer is full.
   2. When there is a matching address, but:
      * the virtual addresses or attributes are **different**.
**NOTE:** According to the definition in the TL-UL spec, the `mask`
must be contiguous and naturally aligned, and `addr` must correspond to
the mask. Therefore, "**same attributes**" here introduces a new
condition: the merged `mask` must remain contiguous and naturally
aligned (function `continueAndAlign`). During merging, the block offset
of `addr` must be updated synchronously in `UncacheEntry.update`.
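A minimal sketch of such a check for an 8-byte beat; the real
`continueAndAlign` may differ:

```scala
import chisel3._

// A TL-UL legal mask is contiguous, naturally aligned, and 2^k bytes wide;
// for an 8-bit mask this is a small set of patterns we can enumerate.
def continueAndAlign(mask: UInt): Bool = Seq(
  "b0000_0001", "b0000_0010", "b0000_0100", "b0000_1000",
  "b0001_0000", "b0010_0000", "b0100_0000", "b1000_0000", // 1 byte
  "b0000_0011", "b0000_1100", "b0011_0000", "b1100_0000", // 2 bytes
  "b0000_1111", "b1111_0000",                             // 4 bytes
  "b1111_1111"                                            // 8 bytes
).map(p => mask === p.U).reduce(_ || _)
```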
## 2. Handshake Mechanism Between `LoadQueueUncache (M)` and `Uncache (S)`
> `mid`: master id
>
> `sid`: slave id

**Old Design:**
- `M` sends a `req` with a **`mid`**.
- `S` receives the `req` and records the **`mid`**.
- `S` sends a `resp` with the **`mid`**.
- `M` receives the `resp` and matches it against the recorded **`mid`**.

**New Design:**
- `M` sends a `req` with a **`mid`**.
- `S` receives the `req` and responds with `{mid, sid}`.
- `M` matches on the **`mid`** and updates its record with the received
**`sid`**.
- `S` sends a `resp` with its **`sid`**.
- `M` receives the `resp` and matches it against the recorded **`sid`**.

**Benefit:** The new design allows `S` to merge requests when a new
request enters.
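A minimal sketch of the new flow from `M`'s side, assuming an `idResp`
channel carrying `{mid, sid}` and per-entry state registers as named
here:

```scala
// On request fire the entry waits for its slave id; the final response is
// matched on sid rather than mid, so S is free to merge behind the ids.
when (io.uncache.req.fire) { entry.state := s_wait_id }
when (io.uncache.idResp.valid && io.uncache.idResp.bits.mid === entry.mid) {
  entry.sid   := io.uncache.idResp.bits.sid
  entry.state := s_inflight
}
when (io.uncache.resp.fire && io.uncache.resp.bits.sid === entry.sid) {
  entry.state := s_finished
}
```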
## 3. Forwarding Mechanism
**Old Design:** Each address in the `ubuffer` is **unique**, so
forwarding is straightforward based on a match.
**New Design:**
* A single address may have up to two entries matched in the `ubuffer`.
* If it has two matched enties, it must be true that one entry is marked
`inflight` and the other entry is marked `waitSame`. In this case, the
forwarded data comes from the merged data of two entries, with the
`inflight` entry being the older one.
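A minimal sketch of the two-entry merge for a 64-bit forward, assuming
byte-granular masks; the younger `waitSame` entry's bytes override the
older `inflight` entry's:

```scala
// Per byte: take the waitSame (younger) entry's data where its mask is set,
// otherwise fall back to the inflight (older) entry.
val fwdData = VecInit((0 until 8).map { i =>
  Mux(waitSameEntry.mask(i),
      waitSameEntry.data(8 * i + 7, 8 * i),
      inflightEntry.data(8 * i + 7, 8 * i))
})
```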
## 4. Bug Fixes
1. In the `loadUnit`, `!tlbMiss` cannot be used directly as `tlbHit`,
because when `tlbValid` is false, `!tlbMiss` can still be true (see the
sketch after the figure).
2. `Uncache` state machine transition: the state indicating "**able to
send requests**" (previously `s_refill_req`, now `s_inflight`) should
be triggered not by `reqFire` but by `acquireFire`.
<img width="747" alt="image"
src="https://github.com/user-attachments/assets/75fbc761-1da8-43d9-a0e6-615cc58cefef"
/>
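A minimal sketch of fix 1 referenced above:

```scala
// A TLB hit needs a valid response; !tlbMiss alone is also true while the
// TLB response is invalid.
val tlbHit = tlbValid && !tlbMiss
```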
# Evaluation
- ✅ timing
- ✅ performance
| Type | 4B×1000 | Speedup vs IO | 1B×4096 | Speedup vs IO |
| -------------- | ------- | ------------- | ------- | ------------- |
| IO | 51026 | 1.00 | 208149 | 1.00 |
| NC | 42343 | 1.21 | 169248 | 1.23 |
| NC+OT | 20379 | 2.50 | 160101 | 1.30 |
| NC+OT+mergeOpt | 16308 | 3.13 | 126369 | 1.65 |
| cache | 1298 | 39.31 | 4410 | 47.20 |
Optimize the load unit writeback data generation logic:
* merge multi-source data at `s2`; select and expand data at `s3`
* select data using a one-hot mux instead of a shifter
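A minimal sketch of the one-hot selection for a 64-bit beat, assuming a
one-hot byte offset `offsetOH`:

```scala
import chisel3._
import chisel3.util._

// Mux1H picks the addressed byte directly from precomputed lanes instead of
// synthesizing a barrel shifter for (data >> (offset << 3)).
def selectByte(data: UInt, offsetOH: UInt): UInt =
  Mux1H(offsetOH, (0 until 8).map(i => data(8 * i + 7, 8 * i)))
```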
For `fast replay`, there is no need to request access to the `RAR/RAW
Queue`.
This prevents the `RAW Queue` from constantly ping-ponging between `not
full` and `full` due to `revoke`.
These two lines were removed because they would lead to combinational
logic loops and the condition was unwanted:
dfc474ebe1/src/main/scala/xiangshan/mem/pipeline/LoadUnit.scala (L1269-L1270)
---
**This may result in some performance gains.**
Fixed a bug where the exception signal was lost at writeback.
Previously, we intended to compare only the writeback ports that
triggered an exception and pick the oldest.
But surprisingly, I just realised that the implementation doesn't match
the comment: if a writeback port without an exception is older, the
port that triggered the exception is not selected.
Also use `s3_exception` to try to optimise timing.
1. `lqIdx` has a smaller bit width.
2. For vectors, the `robIdx` is the same across multiple `flow`s.
Previously, for vectors, we additionally used `uopIdx` for the
comparison, but in theory only `lqIdx/sqIdx` is needed.
Here we change the age comparison for vectors to `lqIdx` to make it
accurate, and change the scalar age comparison to `lqIdx` as well to
reduce cost.
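A minimal sketch of the unified comparison, assuming the circular-queue
helper `isAfter` from the XiangShan utilities is in scope:

```scala
// One rule for scalar and vector: the load whose lqIdx is later in
// circular-queue order is the younger one.
val aIsYounger = isAfter(uopA.lqIdx, uopB.lqIdx)
```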
Vector loads should behave the same as scalar loads: priority must be
arbitrated together with the instructions waiting for replay, otherwise
a deadlock can occur.
Currently, when `RAW` is full, a `RAW nack` is generated, which sends
the load to `LoadQueueReplay`.
When `RAW` is no longer full, instructions are reissued from `Replay`.
Currently, a load instruction enters the `LoadUnit` at `S2`, and if an
exception occurs, a `revoke` is generated at `S3`.
Therefore, the following can happen:
`RAW` has only one entry remaining.
An instruction in `LoadQueueReplay` is sent to `LoadUnit1`.
A load also exists in `LoadUnit0`, so `LoadUnit0` gets the `RAW` entry,
while the load in `LoadUnit1` produces a `RAW nack`.
`LoadUnit0` and `LoadUnit1` then generate a `bank conflict`, causing
`LoadUnit0` to reach `S3` and generate a `fast replay` and a `revoke`.
This makes `RAW` non-full again, so the instruction in
`LoadQueueReplay` that got the `RAW nack` is allowed to reissue.
The reissued instruction in turn creates a `bank conflict` with the
`fast replay` and, due to priority, produces another `RAW nack` for
itself.
As this loop unfolds, the same situation occurs over and over again,
leading to a jam.
`Wu Shen` suggested that this bug could be solved by allowing `fast
replay` to be generated only once.
When a DCache refill response has `denied` or `corrupt` asserted, the
loads belonging to that cache line should report a load access fault.
This is accomplished by including a `corrupt` bit in the DCache MSHR
forwarding and TileLink channel-D forwarding logic and triggering an
exception when `corrupt` is detected.
A store non-data error that comes from a DCache store miss cannot
trigger a precise access fault trap, only an imprecise bus-error
interrupt; that will be included in another commit.
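A minimal sketch of the load-side propagation, with illustrative bundle
names for the two forwarding paths:

```scala
// Fold the corrupt bit from MSHR forwarding and channel-D forwarding into a
// load access fault at writeback (other exception sources omitted).
val fwdCorrupt = (mshrFwd.valid && mshrFwd.corrupt) ||
                 (dChanFwd.valid && dChanFwd.corrupt)
io.ldout.bits.uop.exceptionVec(loadAccessFault) := fwdCorrupt
```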
1. Only if no `pf/af` occurs can an access be considered `mmio`, thus
allowing a misaligned load to generate a misalign exception.
Stores suffer from the same problem, but I will modify `StoreUnit`
later in some other way.
2. Prefetching should not produce misalignment; I had previously placed
the prefetch-handling logic in the wrong place.
Because data inside a WhenContext is not accessible from another
module, to support XSLog collection we move all XSLog calls and related
signals outside the WhenContext. For example,
`when(cond1){ XSDebug(cond2, pable) }` becomes
`XSDebug(cond1 && cond2, pable)`.
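A minimal sketch of the rewrite pattern (condition and format names
illustrative):

```scala
// before: the log call is nested in a WhenContext, so its data cannot be
// collected from another module
when (cond1) {
  XSDebug(cond2, p"value = $dataReg\n")
}
// after: hoist the guard into the predicate; the call now sits at module
// scope and its signals are visible to the XSLog collector
XSDebug(cond1 && cond2, p"value = $dataReg\n")
```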
When a misaligned access encounters MMIO, we should actually generate
the misaligned exception and write it back directly. Therefore
`s2_real_exception`, instead of `s2_exception`, should be used for the
`s_safe_writeback` and `s2_wakeup` judgement.
# L1 DCache RAS extension support
The L1 DCache supports part of the Reliability, Availability, and
Serviceability (RAS) extension.
* L1 DCache protection with Single Error Correct, Double Error Detect
(SECDED) ECC on the RAMs, including the L1 DCache tag and data RAMs.
Erroneous tag or data is not recovered.
* Fault handling interrupt (Bus Error Unit interrupt, BEU, 65)
* Error injection
## ECC Error Detect
An error might be triggered when accessing the L1 DCache.
* **Error Report**:
  * Tag ECC error: as long as an ECC error occurs on any way, an ECC
error is judged to have occurred.
  * Data ECC error: if an ECC error occurs in the hit line, an ECC
error is considered to have occurred; if there is no hit, it is not
processed.
  * If an instruction access triggers an ECC error, a hardware error is
assumed and an exception is reported.
  * Whenever an error occurs, an error message needs to be sent to the
BEU.
  * When the hardware detects an error, it reports it to the BEU, which
triggers the NMI external interrupt (65).
* **Load instruction**:
  * Only tag or data ECC errors can be triggered during execution; the
errors are reported to the BEU and a `Hardware Error` is reported.
* **Probe/Snoop**:
  * If a tag ECC error occurs, there is no need to change the cache
status, and a `ProbeAck` with `corrupt=1` needs to be returned to L2.
  * If a data ECC error occurs, change the cache status according to
the rules. If data needs to be returned, a `ProbeAckData` with
`corrupt=1` needs to be returned to L2.
* **Replace/Evict**:
  * A `ReleaseData` with `corrupt=1` needs to be returned to L2.
* **Store to L1 DCache**:
  * If a tag ECC error occurs, the cacheline is released according to
the `Replace/Evict` process and the data is written to the L1 DCache
without reporting an error to L2.
  * If a data ECC error occurs, the data is written directly without
reporting the error to L2.
* **Atomics**:
  * Report a `Hardware Error`; do not report errors to L2.
## Error Inject
Each core's L1 DCache is configured with a memory-mapped,
register-controlled injection controller, and each hardware unit that
supports ECC is configured with a control bank. After the bank
registers are configured, the L1 DCache will trigger an ECC error on
the first subsequent L1 DCache access.
<div style="text-align: center;">
<img
src="https://github.com/user-attachments/assets/8c4d23c5-0324-4e52-bcf4-29b47a282d72"
alt="err_inject" width="200" />
</div>
### Address Space
The address space is `0x38022000`-`0x3802207F`, 128 bytes in total;
this space is local to each hart.
<div style="text-align: center;">
<img width="292" alt="ctl_bank"
src="https://github.com/user-attachments/assets/89f88b24-37a4-4786-a192-401759eb95cf">
</div>
### L1 DCache Control Bank
Each control bank contains the registers `ECCCTL`, `ECCEID`, and
`ECCMASK`; each register is 8 bytes.
<img width="414" alt="eccctl"
src="https://github.com/user-attachments/assets/b22ff437-d05d-4b3c-a353-dbea1afdc156">
* ECCCTL (ECC Control): ECC injection control register.
  * `ese (error signaling enable)`: indicates that the injection is
armed; initialized to 0. When the injection succeeds and `pst==0`,
`ese` is cleared.
  * `pst (persist)`: persistent injection. When `pst==1`, after the
`ECCEID` counter decreases to 0 and the injection succeeds, the
injection timer is restored to the last configured `ECCEID` and
injection repeats; when `pst==0`, injection happens only once.
  * `ede (error delay enable)`: indicates that the counter is in use;
initialized to 0. If
    * `ese==1` and `ede==0`, error injection takes effect immediately;
    * `ese==1` and `ede==1`, injection takes effect only after `ECCEID`
decrements to 0.
  * `cmp (component)`: injection target, initialized to 0.
    * 1'b0: the injection target is the tag.
    * 1'b1: the injection target is the data.
  * `bank`: bank valid signal, initialized to 0. When a bit in `bank`
is set, the corresponding mask is valid.
<img width="414" alt="ecceid"
src="https://github.com/user-attachments/assets/8cea0d8d-2540-44b1-b1f9-c1ed6ec5341e">
* ECCEID (ECC Error Inject Delay): ECC injection delay counter.
  * When `ese==1` and `ede==1`, it starts to decrement until it reaches
0. Currently it runs on the core clock, which could also be divided.
Since ECC injection relies on an L1 DCache access, the time when `EID`
expires and the time when the ECC error actually triggers may differ.
<img width="414" alt="eccmask"
src="https://github.com/user-attachments/assets/b1be83fd-17a6-4324-8aa6-45858249c476">
* ECCMASK (ECC Mask): ECC injection mask register.
  * 0 means no inversion; 1 means flip the corresponding bit. Tag
injection only uses the bits of `ECCMASK0` corresponding to the tag
length.
### Error Inject Example
```
# set control bank base address
li x3, $(BASEADDR)

# set eid
li x5, 500      # delay 500 cycles
sd x5, 8(x3)    # mmio store

# set mask
li x5, 0x1      # flip bit 0
sd x5, 16(x3)   # mmio store

# set ctl
li x5, 0x7      # cmp = 0, ede = 1, pst = 1, ese = 1
sd x5, 0(x3)    # mmio store
```
* style(pbmt): remove outstanding constant which was just for self-test
* fix(uncache): add mask comparison for `addrMatch`
* style(mem): code normalization
* fix(pbmt): handle cases where the load unit access is byte, word, etc.
* style(uncache): fix an import
* fix(uncache): address match should use the non-offset address when forwarding.
  In this case, to ensure correct forwarding, stores with the same address but overlapping masks cannot be enqueued at the same time.
* style(RAR): remove redundant design of the `nc` reg