Multiline lookbehind, empty submatches array #1412

@roblourens

If I search with a lookbehind pattern that matches a newline and pass the --json flag, the match is reported, but its submatches array is empty. Example:

$ echo -e 'foo\nbar' > test.txt
$ rg --auto-hybrid-regex --multiline --json '(?<=foo\n)bar' test.txt
{"type":"begin","data":{"path":{"text":"test.txt"}}}
{"type":"match","data":{"path":{"text":"test.txt"},"lines":{"text":"bar\n"},"line_number":2,"absolute_offset":4,"submatches":[]}}
{"type":"end","data":{"path":{"text":"test.txt"},"binary_offset":null,"stats":{"elapsed":{"secs":0,"nanos":40700,"human":"0.000041s"},"searches":1,"searches_with_match":1,"bytes_searched":8,"bytes_printed":183,"matched_lines":1,"matches":0}}}
{"data":{"elapsed_total":{"human":"0.002328s","nanos":2328499,"secs":0},"stats":{"bytes_printed":183,"bytes_searched":8,"elapsed":{"human":"0.000041s","nanos":40700,"secs":0},"matched_lines":1,"matches":0,"searches":1,"searches_with_match":1}},"type":"summary"}

Compare to what I get when matching with lookbehind on one line:

$ rg --auto-hybrid-regex --multiline --json '(?<=b)ar' test.txt
...
{"type":"match","data":{"path":{"text":"test.txt"},"lines":{"text":"bar\n"},"line_number":2,"absolute_offset":4,"submatches":[{"match":{"text":"ar"},"start":1,"end":3}]}}

Or when just matching that line directly:

$ rg --auto-hybrid-regex --multiline --json 'bar' test.txt
...
{"type":"match","data":{"path":{"text":"test.txt"},"lines":{"text":"bar\n"},"line_number":2,"absolute_offset":4,"submatches":[{"match":{"text":"bar"},"start":0,"end":3}]}}

The same thing happens with a lookahead that matches a newline.
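
For anyone consuming the --json output from a script, here is a minimal sketch of how the symptom could be detected. It assumes only the JSON Lines shape shown in the output above; the function name `find_empty_submatches` is hypothetical and not part of ripgrep, and the sample events are copied from this report.

```python
import json


def find_empty_submatches(json_lines):
    """Return the "match" events whose "submatches" array is missing or empty."""
    suspicious = []
    for line in json_lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        # Only "match" events carry a submatches array; skip begin/end/summary.
        if event.get("type") == "match" and not event["data"].get("submatches"):
            suspicious.append(event)
    return suspicious


# Sample events copied from the output in this report. Raw strings keep the
# JSON escape sequence \n intact so json.loads decodes it as a newline.
sample = [
    r'{"type":"begin","data":{"path":{"text":"test.txt"}}}',
    # Multiline lookbehind: the match is reported with no submatches.
    r'{"type":"match","data":{"path":{"text":"test.txt"},"lines":{"text":"bar\n"},"line_number":2,"absolute_offset":4,"submatches":[]}}',
    # Single-line lookbehind: submatches is populated as expected.
    r'{"type":"match","data":{"path":{"text":"test.txt"},"lines":{"text":"bar\n"},"line_number":2,"absolute_offset":4,"submatches":[{"match":{"text":"ar"},"start":1,"end":3}]}}',
]

for event in find_empty_submatches(sample):
    data = event["data"]
    print(data["path"]["text"], data["line_number"], repr(data["lines"]["text"]))
```

With the sample above, only the event from the multiline lookbehind search is flagged. In the output above, that same search also reports "matches":0 alongside "matched_lines":1 in its stats.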

Labels: bug, rollup
