Commit Graph

138 Commits

Author SHA1 Message Date
George Krajcsovits acd9aa0afb
fix(textparse/protobuf): metric family name corrupted by NHCB parser (#17156)
* fix(textparse): implement NHCB parsing in ProtoBuf parser directly

The NHCB conversion does some validation, but we can only return error
from Parser.Next() not Parser.Histogram(). So the conversion needs to
happen in Next().

There are 2 cases:
1. "always_scrape_classic_histograms" is enabled, in which case we
convert after returning the classic series. This is to be consistent
with the PromParser text parser, which collects NHCB while spitting out
classic series; then returns the NHCB.
2. "always_scrape_classic_histograms" is disabled. In which case we never
return the classic series.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>

* refactor(textparse): skip classic series instead of adding NHCB around

Do not return the first classic series from the EntryType state,
switch to EntrySeries. This means we need to start the histogram
field state from -3 , not -2.

In EntrySeries, skip classic series if needed.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>

* reuse nhcb converter

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>

* test(textparse/nhcb): test corrupting metric family name

NHCB parse doesn't always copy the metric name from the underlying
parser. When called via HELP, UNIT, the string is directly referenced
which means that the read-ahead of NHCB can corrupt it.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2025-09-08 17:26:41 +02:00
Arve Knudsen 913cc8f72b
Replace gopkg.in/yaml.v2 with go.yaml.in/yaml/v2 (#17151)
* Replace gopkg.in/yaml.v2 with go.yaml.in/yaml/v2
* Upgrade to client_golang@v1.23.2

---------

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
2025-09-06 13:04:24 +02:00
beorn7 747c5ee2b1 Apply analyzer "modernize" to the whole codebase
See
https://pkg.go.dev/golang.org/x/tools/gopls/internal/analysis/modernize
for details.

This ran into a few issues (arguably bugs in the modernize tool),
which I will fix in the next commit, so that we have transparency what
was done automatically.

Beyond those hiccups, I believe all the changes applied are
legitimate. Even where there might be no tangible direct gain, I would
argue it's still better to use the "modern" way to avoid micro
discussions in tiny style PRs later.

Signed-off-by: beorn7 <beorn@grafana.com>
2025-08-27 14:48:41 +02:00
Arve Knudsen 0a40df33fb
Make metric/label name validation scheme explicit (#16928)
* Parameterize metric/label name validation scheme

Parameterized metric/label name validation scheme

---------

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
Co-authored-by: Julius Hinze <julius.hinze@grafana.com>
2025-08-18 08:09:00 +00:00
George Krajcsovits 929bd787ec
fix(ci): run linter with all build tags (#17027)
Fix up lint errors that were not previously checked.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
2025-08-08 09:43:41 +00:00
Matthieu MOREL cef219c31c chore: enable unused-receiver rule from revive
Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>
2025-08-04 09:43:33 +00:00
Lukasz Mierzwa 559fd44be6 Rename labels.go -> labels_slicelabels.go
labels.go is now holding slicelabels code, so let's rename it.

Signed-off-by: Lukasz Mierzwa <l.mierzwa@gmail.com>
2025-07-07 12:37:42 +01:00
Jon Kartago Lamida 819500bdbc
Add ByteSize method for Labels (#16717)
Add `ByteSize()` method to different labels implementations.
One of the use case so that we can track the memory used by Labels.

Signed-off-by: Jon Kartago Lamida <me@lamida.net>
2025-07-04 15:09:01 +01:00
Bryan Boreham 4eafbcae93 lint
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2025-07-02 09:56:28 +01:00
Bryan Boreham e7ac3f440d [TESTS] Labels: Add a test for SizeOfLabels
This requires a bit of repetition to cover all the different builds, but
it seems worth checking that the function does what is expected.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2025-07-02 09:31:27 +01:00
Bryan Boreham 507227781b [REFACTOR] Labels: Extract test case data from TestLabels_String
So we can use them in other tests.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2025-07-02 09:31:25 +01:00
Bartlomiej Plotka 8e6b008608
feature: type-and-unit-labels (PROM-39 implementation) (#16228)
buf.build / lint and publish (push) Has been cancelled Details
CI / Go tests (push) Has been cancelled Details
CI / More Go tests (push) Has been cancelled Details
CI / Go tests with previous Go version (push) Has been cancelled Details
CI / UI tests (push) Has been cancelled Details
CI / Go tests on Windows (push) Has been cancelled Details
CI / Mixins tests (push) Has been cancelled Details
CI / Build Prometheus for common architectures (0) (push) Has been cancelled Details
CI / Build Prometheus for common architectures (1) (push) Has been cancelled Details
CI / Build Prometheus for common architectures (2) (push) Has been cancelled Details
CI / Build Prometheus for all architectures (0) (push) Has been cancelled Details
CI / Build Prometheus for all architectures (1) (push) Has been cancelled Details
CI / Build Prometheus for all architectures (10) (push) Has been cancelled Details
CI / Build Prometheus for all architectures (11) (push) Has been cancelled Details
CI / Build Prometheus for all architectures (2) (push) Has been cancelled Details
CI / Build Prometheus for all architectures (3) (push) Has been cancelled Details
CI / Build Prometheus for all architectures (4) (push) Has been cancelled Details
CI / Build Prometheus for all architectures (5) (push) Has been cancelled Details
CI / Build Prometheus for all architectures (6) (push) Has been cancelled Details
CI / Build Prometheus for all architectures (7) (push) Has been cancelled Details
CI / Build Prometheus for all architectures (8) (push) Has been cancelled Details
CI / Build Prometheus for all architectures (9) (push) Has been cancelled Details
CI / Check generated parser (push) Has been cancelled Details
CI / golangci-lint (push) Has been cancelled Details
CI / fuzzing (push) Has been cancelled Details
CI / codeql (push) Has been cancelled Details
Scorecards supply-chain security / Scorecards analysis (push) Has been cancelled Details
CI / Report status of build Prometheus for all architectures (push) Has been cancelled Details
CI / Publish main branch artifacts (push) Has been cancelled Details
CI / Publish release artefacts (push) Has been cancelled Details
CI / Publish UI on npm Registry (push) Has been cancelled Details
* feature: type-and-unit-labels (extended MetricIdentity)

Experimental implementation of https://github.com/prometheus/proposals/pull/39

Previous (unmerged) experiments:
* https://github.com/prometheus/prometheus/compare/main...dashpole:prometheus:type_and_unit_labels
* https://github.com/prometheus/prometheus/pull/16025

Signed-off-by: bwplotka <bwplotka@gmail.com>

feature: type-and-unit-labels (extended MetricIdentity)

Experimental implementation of https://github.com/prometheus/proposals/pull/39

Previous (unmerged) experiments:
* https://github.com/prometheus/prometheus/compare/main...dashpole:prometheus:type_and_unit_labels
* https://github.com/prometheus/prometheus/pull/16025

Signed-off-by: bwplotka <bwplotka@gmail.com>

* Fix compilation errors

Signed-off-by: Arthur Silva Sens <arthursens2005@gmail.com>

Lint

Signed-off-by: Arthur Silva Sens <arthursens2005@gmail.com>

Revert change made to protobuf 'Accept' header

Signed-off-by: Arthur Silva Sens <arthursens2005@gmail.com>

Fix compilation errors for 'dedupelabels' tag

Signed-off-by: Arthur Silva Sens <arthursens2005@gmail.com>

* Rectored into schema.Metadata

Signed-off-by: bwplotka <bwplotka@gmail.com>

* texparse: Added tests for PromParse

Signed-off-by: bwplotka <bwplotka@gmail.com>

* add OM tests.

Signed-off-by: bwplotka <bwplotka@gmail.com>

* add proto tests

Signed-off-by: bwplotka <bwplotka@gmail.com>

* Addressed comments.

Signed-off-by: bwplotka <bwplotka@gmail.com>

* add schema label tests.

Signed-off-by: bwplotka <bwplotka@gmail.com>

* addressed comments.

Signed-off-by: bwplotka <bwplotka@gmail.com>

* fix tests.

Signed-off-by: bwplotka <bwplotka@gmail.com>

* add promql tests.

Signed-off-by: bwplotka <bwplotka@gmail.com>

* lint

Signed-off-by: bwplotka <bwplotka@gmail.com>

* Addressed comments.

Signed-off-by: bwplotka <bwplotka@gmail.com>

---------

Signed-off-by: bwplotka <bwplotka@gmail.com>
Signed-off-by: Arthur Silva Sens <arthursens2005@gmail.com>
Co-authored-by: Arthur Silva Sens <arthursens2005@gmail.com>
2025-05-17 09:37:25 +00:00
hardlydearly ba4b058b7a refactor: use slices.Contains to simplify code
buf.build / lint and publish (push) Has been cancelled Details
CI / Go tests (push) Has been cancelled Details
CI / More Go tests (push) Has been cancelled Details
CI / Go tests with previous Go version (push) Has been cancelled Details
CI / UI tests (push) Has been cancelled Details
CI / Go tests on Windows (push) Has been cancelled Details
CI / Mixins tests (push) Has been cancelled Details
CI / Build Prometheus for common architectures (0) (push) Has been cancelled Details
CI / Build Prometheus for common architectures (1) (push) Has been cancelled Details
CI / Build Prometheus for common architectures (2) (push) Has been cancelled Details
CI / Build Prometheus for all architectures (0) (push) Has been cancelled Details
CI / Build Prometheus for all architectures (1) (push) Has been cancelled Details
CI / Build Prometheus for all architectures (10) (push) Has been cancelled Details
CI / Build Prometheus for all architectures (11) (push) Has been cancelled Details
CI / Build Prometheus for all architectures (2) (push) Has been cancelled Details
CI / Build Prometheus for all architectures (3) (push) Has been cancelled Details
CI / Build Prometheus for all architectures (4) (push) Has been cancelled Details
CI / Build Prometheus for all architectures (5) (push) Has been cancelled Details
CI / Build Prometheus for all architectures (6) (push) Has been cancelled Details
CI / Build Prometheus for all architectures (7) (push) Has been cancelled Details
CI / Build Prometheus for all architectures (8) (push) Has been cancelled Details
CI / Build Prometheus for all architectures (9) (push) Has been cancelled Details
CI / Check generated parser (push) Has been cancelled Details
CI / golangci-lint (push) Has been cancelled Details
CI / fuzzing (push) Has been cancelled Details
CI / codeql (push) Has been cancelled Details
Scorecards supply-chain security / Scorecards analysis (push) Has been cancelled Details
Stale Check / stale (push) Has been cancelled Details
Lock Threads / action (push) Has been cancelled Details
CI / Report status of build Prometheus for all architectures (push) Has been cancelled Details
CI / Publish main branch artifacts (push) Has been cancelled Details
CI / Publish release artefacts (push) Has been cancelled Details
CI / Publish UI on npm Registry (push) Has been cancelled Details
Signed-off-by: hardlydearly <799511800@qq.com>
2025-05-09 08:27:10 +02:00
Arve Knudsen e7e3ab2824
Fix linting issues found by golangci-lint v2.0.2 (#16368)
buf.build / lint and publish (push) Waiting to run Details
CI / Go tests (push) Waiting to run Details
CI / More Go tests (push) Waiting to run Details
CI / Go tests with previous Go version (push) Waiting to run Details
CI / UI tests (push) Waiting to run Details
CI / Go tests on Windows (push) Waiting to run Details
CI / Mixins tests (push) Waiting to run Details
CI / Build Prometheus for common architectures (0) (push) Waiting to run Details
CI / Build Prometheus for common architectures (1) (push) Waiting to run Details
CI / Build Prometheus for common architectures (2) (push) Waiting to run Details
CI / Build Prometheus for all architectures (0) (push) Waiting to run Details
CI / Build Prometheus for all architectures (1) (push) Waiting to run Details
CI / Build Prometheus for all architectures (10) (push) Waiting to run Details
CI / Build Prometheus for all architectures (11) (push) Waiting to run Details
CI / Build Prometheus for all architectures (2) (push) Waiting to run Details
CI / Build Prometheus for all architectures (3) (push) Waiting to run Details
CI / Build Prometheus for all architectures (4) (push) Waiting to run Details
CI / Build Prometheus for all architectures (5) (push) Waiting to run Details
CI / Build Prometheus for all architectures (6) (push) Waiting to run Details
CI / Build Prometheus for all architectures (7) (push) Waiting to run Details
CI / Build Prometheus for all architectures (8) (push) Waiting to run Details
CI / Build Prometheus for all architectures (9) (push) Waiting to run Details
CI / Report status of build Prometheus for all architectures (push) Blocked by required conditions Details
CI / Check generated parser (push) Waiting to run Details
CI / golangci-lint (push) Waiting to run Details
CI / fuzzing (push) Waiting to run Details
CI / codeql (push) Waiting to run Details
CI / Publish main branch artifacts (push) Blocked by required conditions Details
CI / Publish release artefacts (push) Blocked by required conditions Details
CI / Publish UI on npm Registry (push) Blocked by required conditions Details
Scorecards supply-chain security / Scorecards analysis (push) Waiting to run Details
* Fix linting issues found by golangci-lint v2.0.2

---------

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
2025-05-03 19:05:13 +02:00
Bryan Boreham ca416c580c
Merge branch 'main' into slicelabels
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2025-05-02 10:31:57 +01:00
Bryan Boreham b2c2146d7c
Labels: simpler/faster stringlabels encoding (#16069)
Instead of using varint to encode the size of each label, use a single
byte for size 0-254, or a flag value of 255 followed by the size in
3 bytes little-endian.

This reduces the amount of code, and also the number of branches in
commonly-executed code, so it runs faster.

The maximum allowed label name or value length is now 2^24 or 16MB.

Memory used by labels changes as follows:
* Labels from 0 to 127 bytes length: same
* From 128 to 254: 1 byte less
* From 255 to 16383: 2 bytes more
* From 16384 to 2MB: 1 byte more
* From 2MB to 16MB: same

Labels: panic on string too long.

Slightly more user-friendly than encoding bad data and finding out when
we decode.

Clarify that Labels.Bytes() encoding can change

---------

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2025-04-30 10:53:48 +01:00
Lukasz Mierzwa 05088aaa12 Fix linter errors
Mostly comment issues and unused variables.

Signed-off-by: Lukasz Mierzwa <l.mierzwa@gmail.com>
2025-04-15 18:04:41 +01:00
Lukasz Mierzwa bb76966992 Use stringlabels by default
This removes the stringlabels build tag, makes that implementation the default one, and moves the old labels implementation under the slicelabels build tag.
Fixes #16064.

Signed-off-by: Lukasz Mierzwa <l.mierzwa@gmail.com>
2025-04-15 17:52:24 +01:00
wellweek 4e91f13db2 refactor: use slices.Equal to simplify code
Signed-off-by: wellweek <xiezitai@outlook.com>
2025-03-27 12:17:35 +01:00
Owen Williams 94b43c5d4c utf8: Remove support for legacy global validation setting
Global and Data Source configurations can specify legacy mode, but Prometheus now requires that the overall validation mode be set to UTF-8

Signed-off-by: Owen Williams <owen.williams@grafana.com>
2025-03-13 10:47:24 -04:00
Matthieu MOREL c7d4b53ec1 chore: enable unused-parameter from revive
Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>
2025-02-19 19:50:28 +01:00
Bryan Boreham ac4f8a5e23
[ENHANCEMENT] TSDB: Improve calculation of space used by labels (#13880)
* [ENHANCEMENT] TSDB: Improve calculation of space used by labels

The labels for each series in the Head take up some some space in the
Postings index, but far more space in the `memSeries` structure.

Instead of having the Postings index calculate this overhead, which is
a layering violation, have the caller pass in a function to do it.

Provide three implementations of this function for the three Labels
versions.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-12-16 09:42:52 +00:00
Owen Williams 8d4bcd2c77 promql: Fix various UTF-8 bugs related to quoting
Fixes UTF-8 aggregator label list items getting mutated with quote marks when String-ified.
Fixes quoted metric names not supported in metric declarations.
Fixes UTF-8 label names not being quoted when String-ified.

Fixes https://github.com/prometheus/prometheus/issues/15470
Fixes https://github.com/prometheus/prometheus/issues/15528

Signed-off-by: Owen Williams <owen.williams@grafana.com>
Co-authored-by: Bryan Boreham <bjboreham@gmail.com>
2024-12-04 14:18:59 -05:00
Arve Knudsen 89bbb885e5
Upgrade to golangci-lint v1.62.0 (#15424)
Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
2024-11-20 17:22:20 +01:00
Bryan Boreham 5571c7dc98 FastRegexMatcher: use stack memory for lowercase copy of string
Up to 32-byte values this saves garbage, runs faster.
For prefixes, only `toLower` the part we need for the map lookup.

Split toNormalisedLower into fast and slow paths, to avoid a penalty
for the `copy` call in the case where no allocations are done.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-10-28 16:28:58 +00:00
Bryan Boreham 31c5760551
Neater string vs byte-slice conversions (#14425)
unsafe.Slice and unsafe.StringData were added in Go 1.20

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-09-21 12:19:21 +02:00
Mario Fernandez 5814920601
Fix: optimize .* regexp performance
Shortcut for `.*` matches newlines as well.
Add preamble change ^(?s:
Add test
dotAll flag por al regex
Add and fix regex tests

Signed-off-by: Mario Fernandez <mariofer@redhat.com>
2024-09-17 12:18:31 +02:00
Owen Williams 9da75328ea
fix(utf8): ensure correct validation when legacy mode turned on (#14736)
fix(utf8): ensure correct validation when legacy mode turned on

This depends on the included update of the prometheus/common dependency.

---------

Signed-off-by: Owen Williams <owen.williams@grafana.com>
2024-08-28 17:15:42 +02:00
beorn7 0f760f63dd lint: Revamp our linting rules, mostly around doc comments
Several things done here:

- Set `max-issues-per-linter` to 0 so that we actually see all linter
  warnings and not just 50 per linter. (As we also set
  `max-same-issues` to 0, I assume this was the intention from the
  beginning.)

- Stop using the golangci-lint default excludes (by setting
  `exclude-use-default: false`. Those are too generous and don't match
  our style conventions. (I have re-added some of the excludes
  explicitly in this commit. See below.)

- Re-add the `errcheck` exclusion we have used so far via the
  defaults.

- Exclude the signature requirement `govet` has for `Seek` methods
  because we use non-standard `Seek` methods a lot. (But we keep other
  requirements, while the default excludes completely disabled the
  check for common method segnatures.)

- Exclude warnings about missing doc comments on exported symbols. (We
  used to be pretty adamant about doc comments, but stopped that at
  some point in the past. By now, we have about 500 missing doc
  comments. We may consider reintroducing this check, but that's
  outside of the scope of this commit. The default excludes of
  golangci-lint essentially ignore doc comments completely.)

- By stop using the default excludes, we now get warnings back on
  malformed doc comments. That's the most impactful change in this
  commit. It does not enforce doc comments (again), but _if_ there is
  a doc comment, it has to have the recommended form. (Most of the
  changes in this commit are fixing this form.)

- Improve wording/spelling of some comments in .golangci.yml, and
  remove an outdated comment.

- Leave `package-comments` inactive, but add a TODO asking if we
  should change that.

- Add a new sub-linter `comment-spacings` (and fix corresponding
  comments), which avoids missing spaces after the leading `//`.

Signed-off-by: beorn7 <beorn@grafana.com>
2024-08-22 17:36:11 +02:00
Bryan Boreham d84282b105 Labels: use single byte as separator - small speedup
Since `seps` is a variable, `seps[0]` has to be bounds-checked every
time. Replacing with a constant everywhere it is used skips this
overhead.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-07-15 09:47:16 +01:00
Bryan Boreham 82a8c6abe2
[ENHANCEMENT] Optimize regexps with multiple prefixes (#13843)
For example `foo.*|bar.*|baz.*`. Instead of checking each one in turn,
we build a map of prefixes, then check the smaller set that could match
the string supplied.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>

* Improve testing and readability

Address review comments on #13843

Signed-off-by: Marco Pracucci <marco@pracucci.com>
2024-07-03 18:45:36 +01:00
Marco Pracucci ec31acaf02
FastRegexMatcher: small optimization for the literal prefix case
Signed-off-by: Marco Pracucci <marco@pracucci.com>
2024-07-01 10:12:50 +02:00
Bryan Boreham 7a82e4b503 Labels benchmarks: remove artefact of small symbol-tables
Symbol tables with fewer than 128 entries, so everything can be
represented as a single byte, are not realistic.

Stuff the symbol table with fake entries before adding the real ones.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-06-21 16:49:10 +01:00
Bryan Boreham 2ba7bc9446 Labels: further optimisation for dedupelabels
Inline (by copy-paste) the fast path of `decodeVarint` in various
places where it gets called a lot.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-06-21 16:46:13 +01:00
Bryan Boreham 2ced2f6aec [PERF] Labels: faster varint for dedupelabels
Including tests.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
2024-06-21 11:57:09 +01:00
Bryan Boreham 84602bbace
Merge branch 'main' into fix-matcher-string-with-empty-label-name 2024-06-19 05:56:25 -04:00
Oleg Zaytsev 4f78cc809c
Refactor `toNormalisedLower`: shorter and slightly faster. (#14299)
Refactor toNormalisedLower: shorter and slightly faster

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
2024-06-18 09:57:37 +00:00
Oleg Zaytsev 03cf6141d4
Fix Matcher.String() with empty label name
When the label name is empty, which can happen now with quoted label
name, it should be quoted when printed as a string again.

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
2024-06-13 18:46:35 +02:00
Ranveer Avhad 39902ba694
[BUGFIX] FastRegexpMatcher: do Unicode normalization as part of case-insensitive comparison (#14170)
* Converted string to standarized form
* Added golang.org/x/text in Go dependencies
* Added test cases for FastRegexMatcher
* Added benchmark for toNormalizedLower

Signed-off-by: RA <ranveeravhad777@gmail.com>
2024-06-10 18:31:41 -04:00
Marco Pracucci d966ae6400
Optimize containsInOrder() inlining it
Signed-off-by: Marco Pracucci <marco@pracucci.com>
2024-06-04 10:34:15 +02:00
Marco Pracucci a0807733be
Improved tests
Signed-off-by: Marco Pracucci <marco@pracucci.com>
2024-06-04 10:34:15 +02:00
Marco Pracucci 78fdd2188d
Improve contains check done by FastRegexMatcher
Signed-off-by: Marco Pracucci <marco@pracucci.com>
2024-06-04 10:34:15 +02:00
Bryan Boreham 1e0b0e250a
Merge pull request #14090 from colega/improve-zeroOrOneCharacterStringMatcher-Matches
Improve `zeroOrOneCharacterStringMatcher` by using `utf8.DecodeRuneInString`
2024-05-16 09:28:53 +01:00
Oleksandr Redko f10c3454e9 Enable perfsprint linter and fix up code
Signed-off-by: Oleksandr Redko <oleksandr.red+github@gmail.com>
2024-05-15 17:51:05 +03:00
Oleg Zaytsev 8b4c9459a2
Check utf8.RuneError result
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
2024-05-13 17:44:26 +02:00
Oleg Zaytsev dbe88fae22
Add invalid utf8 test cases to regexp
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
2024-05-13 17:05:31 +02:00
Oleg Zaytsev bcff5059e6
Use utf8.DecodeRuneInString(s)
This replaces the custom `moreThanOneRune` function with the standard
`utf8.DecodeRuneInString(s)` that can be used to figure out the size of
the first rune.

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
2024-05-13 15:41:00 +02:00
Oleg Zaytsev fdfc6d4725
Benchmark zeroOrOneCharacterStringMatcher.Matches
This adds some more test cases for unicode values, and also a benchmark
for zeroOrOneCharacterStringMatcher.Matches()

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
2024-05-13 15:36:55 +02:00
Oleg Zaytsev b7b4355807
Use bytes.Buffer from stack buf in Matcher.String()
Also removed the growing until there's a benchmark for that.

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
2024-05-09 10:00:24 +02:00
Oleg Zaytsev 6ebda5a7bc
Optimize Matcher.String()
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
2024-05-08 17:05:27 +02:00