ripgrep fails to match pattern including digit character class #1203
Closed
Labels
bug
Description
What version of ripgrep are you using?
ripgrep 0.10.0 (rev 8a7db1a918)
-SIMD -AVX (compiled)
+SIMD +AVX (runtime)
How did you install ripgrep?
$ brew tap burntsushi/ripgrep https://github.com/BurntSushi/ripgrep.git
$ brew install ripgrep-bin
What operating system are you using ripgrep on?
macOS 10.14.3 (18D109)
Describe your question, feature request, or bug.
rg appears to fail to find a certain pattern in a one-line file that definitely contains that pattern.
I must be missing something — this seems very unlikely to be a legitimate bug — but I can't figure out what.
If this is a bug, what are the steps to reproduce the behavior?
1. echo 153.230000 >| test.txt
2. rg '\d\d\d00' test.txt. This successfully finds a match of 23000.
3. rg '\d\d\d000' test.txt. This fails to find any match, when it should match 230000.
Note that grep '\d\d\d000' test.txt correctly matches 230000 (grep --version reports grep (BSD grep) 2.5.1-FreeBSD).
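As an independent cross-check of the expected matches, the same patterns can be run through grep -E -o, which prints only the matched text. This is a minimal sketch, not part of the original report: first_match is a hypothetical helper, and [0-9] is used in place of \d on the assumption that the two are equivalent for this ASCII-only input. The third call uses the variant corpus with three 0 characters prepended.

```shell
#!/bin/sh
# Hypothetical helper (not part of ripgrep or grep): print the first substring
# of a string that matches an extended regular expression.
first_match() {
    # $1 = ERE pattern, $2 = input text
    printf '%s\n' "$2" | grep -E -o -- "$1" | head -n 1
}

# Assumption: [0-9] stands in for \d, which holds here because the input is ASCII.
first_match '[0-9][0-9][0-9]00'  '153.230000'     # the pattern rg does match
first_match '[0-9][0-9][0-9]000' '153.230000'     # the pattern rg fails to match
first_match '[0-9][0-9][0-9]000' '000153.230000'  # variant corpus with 000 prepended
```

If the three calls print 23000, 230000, and 230000, the pattern itself is satisfiable on this input, which points at rg's handling rather than at the regex.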
If this is a bug, what is the actual behavior?
$ echo 153.230000 >| test.txt
$ rg --debug '\d\d\d000' test.txt
DEBUG|grep_regex::literal|grep-regex/src/literal.rs:110: required literal found: "000"
DEBUG|globset|globset/src/lib.rs:429: built glob set; 0 literals, 0 basenames, 8 extensions, 0 prefixes, 0 suffixes, 0 required extensions, 0 regexes
DEBUG|globset|globset/src/lib.rs:424: glob converted to regex: Glob { glob: "**/.*.s[a-w][a-z]", re: "(?-u)^(?:/?|.*/)\\..*\\.s[a-w][a-z]$", opts: GlobOptions { case_insensitive: false, literal_separator: false, backslash_escape: true }, tokens: Tokens([RecursivePrefix, Literal('.'), ZeroOrMore, Literal('.'), Literal('s'), Class { negated: false, ranges: [('a', 'w')] }, Class { negated: false, ranges: [('a', 'z')] }]) }
DEBUG|globset|globset/src/lib.rs:429: built glob set; 0 literals, 3 basenames, 0 extensions, 0 prefixes, 0 suffixes, 0 required extensions, 1 regexes
$
If this is a bug, what is the expected behavior?
rg '\d\d\d000' test.txt should identify the single match in the file, as grep does. Specifically:
$ rg '\d\d\d000' test.txt
1:153.230000
Other
Note that changing the corpus in seemingly irrelevant ways can cause the bug to change or disappear. For example, the \d\d\d000 pattern matches if three 0 characters are prepended to the contents of the file (that is, the file contains 000153.230000).