How NOT to match in just one regex

php,regex,match,pcre

Zerquix, here's the most compact regex I could think of. It only matches the right ones. $regex = "~\\\\%(*SKIP)(*F)|%[sdf]~"; $string = "match %s, %d and %f but NOT with \%s, \%d and \%f."; if(preg_match_all($regex,$string,$m)) print_r($m); See live demo...

UltraEdit: Deleting all lines under a certain length with \n and or \r

regex,pcre,ultraedit

Because ^.{0,5}$\n\r is not the same as ^.{0,5}$\r\n. \n\r is a linefeed followed by carriage return. \r\n is a carriage return followed by linefeed - a popular line ending combination of characters. Specifically \r\n is used by the MS-DOS and Windows family of operating systems, among others. ...
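
The difference between the two orderings can be sketched in Python. This is only an illustration: it uses Python's `re` module rather than UltraEdit's engine, the sample text is made up, and `[^\r\n]` replaces `.` so that the line body can never swallow the carriage return.

```python
import re

# Hypothetical sample with Windows (CRLF) line endings.
text = "abc\r\nlonger line\r\nxy\r\n"

# Lines of at most 5 characters followed by CR then LF (the Windows order).
crlf = re.compile(r"^[^\r\n]{0,5}\r\n", re.MULTILINE)
# Same pattern with the two characters swapped (LF then CR).
lfcr = re.compile(r"^[^\r\n]{0,5}\n\r", re.MULTILINE)

removed = crlf.sub("", text)    # short lines are deleted
unchanged = lfcr.sub("", text)  # nothing matches, text is left alone

print(repr(removed))
print(repr(unchanged))
```

Only the `\r\n` variant finds anything, because the sequence `\n\r` never occurs in CRLF text unless there is an empty line.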

Search html string using regular expression and store in array in PHP

php,html,regex,pcre

Update: It seems that you are reading your regexp results the wrong way. Executing preg_match_all('/<div(\s)+class="icon_star">.*?<\/div>/i', $html, $result_array1); for($x = 0; $x < count($result_array1); $x++) $result_array1[$x] = array_map('htmlentities', $result_array1[$x]); echo '<pre>' . print_r($result_array1, 1); prints out Array ( [0] => Array ( [0] => <div class="icon_star">&nbsp;</div> ) [1] => Array ( [0]...

PHP preg_match exclude

php,regex,preg-match,pcre,exclude

Your negative lookahead should have .* in front to allow for 0 or more characters before not-allowed text. Also use anchors in your regex. regex should be: preg_match('/^.*?1920x1200.*$(*SKIP)(*F)|(?:\d+[a-z]|[a-z]+\d)[a-z\d]*/im') RegEx Demo...

regex “something or nothing” match

regex,pcre,splunk

You can use this regex: \/A\/B\/C\/D\/([^\/]+)\/(\w+)?(?:\.(\w+))?\.(?:log|trc) In case you have other characters than just letters, numbers or underscore (captured by \w), you can limit to [^.]+: \/A\/B\/C\/D\/([^\/]+)\/([^.]+)?(?:\.([^.]+))?\.(?:log|trc) See Demo 1 and Demo 2...

Getting multiple subpatterns with the same name

php,regex,preg-match,pcre

Should be able to use the \G anchor for this. # '~(?:(?!\A)\G|^Use\s+),?\s*(?<ns>[^,;]+)(?=(?:,|[^,;]*)*;)~mi' (?xmi-) # Inline modifier = expanded, multiline, case insensitive (?: (?! \A ) # Not beginning of string \G # If matched before, start at end of last match | # or, ^ Use \s+ # Beginning of...

PCRE pattern that validates more than I want

regex,expression,pcre

You can force the string to match only if it ends with a slash or an alphanumeric using a positive look-ahead (?=.*(?:\/|[[:alnum:]_-])$) at the beginning: (?=.*(?:\/|[[:alnum:]_-])$)(?:\/Services\/(?'var'[[:alnum:]_-]+)|(?<!^)\G)\/(?:(?'params'[[:alnum:]_-]+))? Note I am using multiline mode assuming you have these strings as separate entities. See demo...

Lazy quantifier and /s modifier

regex,pcre

Just turn the in-between .+ into .+? so that it does a non-greedy match; otherwise it greedily matches as many characters as possible. <a.*?href=\"(.+?.pdf\?min)\".*?>(.*?)<\/a> ^ DEMO...
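
A minimal sketch of the greedy/lazy difference, written with Python's `re` module instead of PCRE. The HTML fragment is invented for the illustration, and the dot before `pdf` is escaped here, which the pattern above does not do.

```python
import re

# Hypothetical input with two links on one line.
html = '<a href="a.pdf?min">A</a> <a href="b.pdf?min">B</a>'

# Greedy .+ runs to the last possible ".pdf?min" and captures too much.
greedy = re.findall(r'href="(.+\.pdf\?min)"', html)
# Lazy .+? stops at the first ".pdf?min" it can reach.
lazy = re.findall(r'href="(.+?\.pdf\?min)"', html)

print(greedy)
print(lazy)
```

The greedy version returns a single oversized capture spanning both links, while the lazy version returns one capture per link.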

Can't link libpcrecpp w/ MinGW

c++,linker,mingw,gnu,pcre

You have compiled pcrecpp as a static library, so you need to define PCRE_STATIC when compiling your code, see https://github.com/vmg/pcre/blob/a257f5c7acc12e64dc2b5aa170b8e4b87dc34f83/pcreposix.h#L117 i586-mingw32msvc-g++ -std=c++11 -o test.exe -DPCRE_STATIC -Ipcre-install/include test.cpp \ pcre-install/lib/libpcre.a \ pcre-install/lib/libpcrecpp.a \ pcre-install/lib/libpcreposix.a Without PCRE_STATIC all public functions are marked as dllimport and have different name mangling...

Regular expression to match no-extension file name string

regex,pcre

You can use this https://regex101.com/r/mI9qC6/1 ^Content-(Disposition|Type).*name\s*=\s*"?([^.]*((\.|=2E)( ade|adp|asp|bas|bat|chm|cmd|com|cpl|crt|dll|exe| hlp|ht[at]| inf|ins|isp|jse?|lnk|md[betw]|ms[cipt]|nws| \{[[:xdigit:]]{8}(?:-[[:xdigit:]]{4}){3}-[[:xdigit:]]{12}\}| ops|pcd|pif|prf|reg|sc[frt]|sh[bsm]|swf| vb[esx]?|vxd|ws[cfh]))?)(\?=)?"?\s*(;|$) ...

Regex grouping including subgroups of ors and maintaining invariant of capture count

regex,pcre

The capture group amount is equal to the parenthesis structure count and not changed by the presence of alternations. Which is why, if you add together different regexes with capture groups by alternations, you will have more groups than you'd like and you either have to change the regex or...

How to match the middle character in a string with regex?

java,regex,perl,pcre

With PCRE and Perl (and probably Java) you could use: ^(?:.(?=.*?(?(1)(?=.\1$))(.\1?$)))*(.) which would capture the middle character of odd length strings in the 2nd capturing group. Explained: ^ # beginning of the string (?: # loop . # match a single character (?= # non-greedy lookahead to towards the end...

Regular Expression from csv like to php array declaration

regex,preg-replace,pcre

You could do multiple search and replaces that search for ^ *(\w+)(.*)\r\n *(\w+) and replace with !\1=\3\r\n\2\r\n, assuming Windows style line endings. Then run another simple edit to remove the leading ! characters. Using the example from the question field1 field2 field3 field4 ... value1 value2 value3 value4 ... The...

Selectively replace words with regular expressions

regex,sublimetext3,pcre

It is not possible to put conditional statements in a replacement string or to store data (that is not in the string) in the pattern itself. The simplest way with Sublime Text is obviously to proceed in several steps (replace the special strings first, then the general case). The...

RegEx: Match nth occurrence

regex,pcre

You can use this regex: _name=(?:[^"]*"){3} RegEx Demo...

Invalid Lookbehind With HTML Tags

regex,sublimetext,sublimetext3,pcre

The problems were: First, as MattDMo commented, Sublime uses the PCRE regex engine. Second, the third broken-down part of the regex, (?!\s*<\/section), was a negative lookahead, and it should be a positive one (if followed by, and not if NOT followed by). That would be (?=\s*</section). Third, in PCRE...

How to remove all alphanumeric words from the text?

php,regex,pcre

How about replacing \b(?=[a-z]+\d|[a-z]*\d+[a-z]+)\w*\b\s* with nothing? Demo: https://regex101.com/r/jA2fW3/1 Pattern code: $pattern = '/\b(?=[a-z]+\d|[a-z]*\d+[a-z]+)\w*\b\s*/i'; To match alphanumeric words containing foreign/accented letters, use the following pattern: $pattern = '/\b(?=[\pL]+\d|[\pL]*\d+[\pL]+)[\pL\w]*\b\s*/i'; Demo: https://regex101.com/r/jA2fW3/3...

shorter way of a regex

php,regex,pcre

You can use: /\b([1-9])\g{1}0\b/ RegEx Demo Breakup of regex: \b # word boundary [1-9] # match digit 1-9 and group them as captured group #1 \g{1} # back-reference to group #1 0 # match 0 \b # word boundary ...
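
The same idea can be tried in Python, with one caveat: Python's `re` has no `\g{1}` in patterns, and writing `\10` would be read as a reference to group 10. The sketch below therefore uses a named group (`d` is a name chosen for this example) and an invented input string.

```python
import re

# A digit 1-9, the same digit again, then a literal 0, as a whole word.
# (?P=d) plays the role of \g{1}; a bare \1 followed by 0 would read as \10.
pattern = re.compile(r"\b(?P<d>[1-9])(?P=d)0\b")

# Hypothetical sample input.
text = "110 220 120 330 1100 005"
hits = [m.group() for m in pattern.finditer(text)]
print(hits)
```

The braces in `\g{1}` serve the same purpose in PCRE: they keep the group number separate from the digit that follows it.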

Why does this Regex match even though it should fail?

regex,nginx,pcre

Your regex matches the invalid strings because the ID is optional: [0-9]* can match an empty string. Simply replace the * with a + to require at least one digit. Here's an improved version BTW: forum\/index\.php.+?\bmsg=?(\d+) Demo You should have escaped the .. I also added \b just before msg...
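
The effect of `*` versus `+` on the ID can be checked with a short Python sketch. It uses Python's `re` rather than nginx's PCRE, and both URLs are made-up examples rather than the ones from the question.

```python
import re

# \d* lets the ID be empty; \d+ requires at least one digit.
optional_id = re.compile(r"forum/index\.php.+?\bmsg=?(\d*)")
required_id = re.compile(r"forum/index\.php.+?\bmsg=?(\d+)")

good = "forum/index.php?topic=7.msg=42"  # hypothetical valid URL
bad = "forum/index.php?topic=7.msg="     # hypothetical URL with no ID

loose = optional_id.search(bad)   # still matches, with an empty capture
strict = required_id.search(bad)  # no match at all
found = required_id.search(good)

print(loose.group(1) == "", strict, found.group(1))
```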

Swift regular expression format?

ios,regex,swift,pcre

Within double quotes, a single backslash is read as the start of an escape sequence. You need to escape every backslash once more for it to be treated as a regex backslash. "^([1-9]\\d{0,2}(,\\d{3})*|([1-9]\\d*))(\\.\\d{2})?$" ...

Regex finding all matches [duplicate]

regex,pcre

To make .* non-greedy (lazy), append ? to it: items\/[sS]tart.*?itempopup\(event,\'(\d+) Saying that .* is greedy means that the engine repeats it as many times as it can, so the regex keeps trying to match the . against the next characters, ending up matching all the tokens, then it'll backtrack until it matches...

Regex Lookarounds, prevent ahead and behind

php,regex,pcre

with PCRE/php you could use (*SKIP)(*FAIL) \d*\s?\d+\/\d+"*\s*(in|")\sx\s\d*\s?\d+\/\d+"*\s*(in|")(*SKIP)(*FAIL)|\d*\s?\d+\/\d+"*\s*(in|") Demo...

php shell_exec regular expression PCRE/POSIX

php,regex,posix,pcre,shell-exec

Thank you all, I found the solution: I just added the -P option to grep in my script, like this: result=$(grep -P -c "${1}" myLongFile.txt) Now I can use \s. -P, --perl-regexp Interpret PATTERN as a Perl regular expression. ...

Regex matching lines not containing word EMPTY

regex,pcre

Put .+ or .* after the negative lookahead. The word boundary added before the negative lookahead is also very much needed. (\d+)\s+(\S+)\s+(\w+)\s+\w+\s+\d*\s+\#\s+\S+\s+\d+\s+\d+\b(?!\h+EMPTY\b)\s*(.*) DEMO...

Regex pattern to get string between curly braces

php,regex,pcre

You can use this recursive regex pattern in PHP: $re = '/( { ( (?: [^{}]* | (?1) )* ) } )/x'; $str = "The quick brown {fox, dragon, dinosaur} jumps over the lazy {dog, cat, bear, {lion, tiger}}."; preg_match_all($re, $str, $matches); print_r($matches[2]); RegEx Demo...

single PCRE regex to swap '0' for '1' and '1' for '0' at a specific location in a string

regex,pcre

The presence of another x in the string is not a problem since you don't need it: sed 'y/01/10/' input >output or without sed: tr 01 10 <input >output I'm posting this from the comment by @Casimir et Hippolyte as Community Wiki because I think it's the best answer,...

Snort rules regex matching

regex,pcre,snort

The (slightly crazy) syntax is pcre:"/regex/flags". The parentheses you wanted to put in there are superfluous anyway. You also need to escape any slash which is part of the actual regex, like in the example. alert tcp any any -> any any(msg:"PDF is being downloaded"; pcre:"/.*site\/year\d\d\d\d.pdf/i"; sid: 100003; rev:3;) ......

pcre2 UTF32 usage

c++,unicode,pcre,utf

In the end I dropped PCRE2: after evaluating RE2, PCRE2 and ICU, I chose ICU. Its Unicode support (from what I've seen so far) is much more complete than that of the other two. It also provides a very clean API and lots of utilities for manipulation. Importantly, like PCRE2 it provides a perl...

Are negative lookbehind in regex searches possible in Geany?

regex,pcre,geany,negative-lookbehind

I got support from geany devs on freenode. Very helpful. Here is what they told me: The documented RE syntax only applies to the RE engine directly used by Geany (e.g. in Find), but the Find in Files features calls the grep tool (as configured in preferences->tools->grep), which has its...

PCRE make parameters optional

regex,preg-match,pcre

You can use this regex with optional components: ^/city-and-country-to-state(?:/(?<country>[^/\n]+)(?:/(?<city>[^/\n]+))?)?/?$ RegEx Demo Code: preg_match_all( '~^/city-and-country-to-state(?:/(?<country>[^/]+)(?:/(?<city>[^/]+))?)?/?$~um', $input, $matches ); ...

How to match all alphabet except few?

regex,pcre

You could use negative lookahead assertion. It's like a kind of subtraction. (?![aeiou])[a-z] ^ ^ | | subtract from DEMO...
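
A small sketch of this "subtraction" with Python's `re`, which supports the same lookahead syntax. The input string is an arbitrary example.

```python
import re

# [a-z] minus the vowels: the lookahead rejects a position before any letter
# is consumed, so only consonants survive.
consonant = re.compile(r"(?![aeiou])[a-z]")

letters = consonant.findall("regex demo")
print(letters)
```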

Finding all the ten different digits in a random string

regex,pcre

What about this pattern to verify the string: ^\D*(?>(\d)(?!.*\1)\D*){10}$ ^\D* Starts with any amount of characters that are not digits (?>(\d)(?!.*\1)\D*){10} followed by 10x: a digit (captured in first capturing group), if the captured digit is not ahead, followed by any amount of \D non-digits, using a negative lookahead. So...
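
The pattern can be exercised in Python as a rough check. One assumption to note: the atomic group `(?>...)` is swapped for a plain non-capturing group `(?:...)`, because atomic groups only exist in Python's `re` from version 3.11 on. The helper name and the sample strings are made up.

```python
import re

# Ten repetitions of: a digit that does not appear again later, then non-digits.
# (?:...) stands in for the atomic group (?>...) of the original pattern.
ten_distinct = re.compile(r"^\D*(?:(\d)(?!.*\1)\D*){10}$")

def has_all_ten_digits(s: str) -> bool:
    """Hypothetical helper: True if s contains each digit 0-9 exactly once."""
    return ten_distinct.match(s) is not None

print(has_all_ten_digits("a1b2c3d4e5f6g7h8i9j0"))
print(has_all_ten_digits("1123456789"))
```

Because every digit must be unique and exactly ten are required, a match implies that all ten digits are present.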

Regex that matches contents of () with nested () in it

regex,bash,pcre

Since regex will find the longest possible match, you can just use \(.*\) If you care about nesting and want to find the outermost, e.g. for ((a)) and (b)))) you want to find ((a)) and (b), then that's a typical example of a grammar that you technically can't match with...

PCRE regex with multiples substrings

regex,pcre

Nearly. In POSIX regexes, parentheses and other special characters have to be escaped if you want them to match themselves, not to access their special function, so it has to be char const * regex = "open\\(\"([^\"]*)\", *([^\\)]*)\\).*"; In addition, if you want the captures, you have to compile the...

Regular expression to find text in specific Unicode range separated by spaces

regex,unicode,pcre

How can I change my regular expression so that it will match words in the specified Unicode range separated by spaces, but not two consecutive spaces You can use: ^[\u0370-\u03FF]+( [\u0370-\u03FF]+)*$ ...

preg_replace() doesn't work

php,regex,pcre

Do you put your changes back to config? // this is really BAD approach $content = file_get_contents("config"); $content = preg_replace(...); file_put_contents("config", $content); // or eval($content); if you want to initiate reconnect to the database with slightly different parameter you can simply move your piece of code to separate function...

Regex negative prefix and suffix

php,regex,preg-match,pcre

preg_match('/(download-)?(google-chrome|mozilla-firefox)(-free)?/', $string, $match); The ? indicates that the group before it is optional. If the prefix or suffix isn't in the string, those capture groups will be empty in $match. If you don't actually want to return the groups with the optional prefix and suffix, make them non-capturing groups by...
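
For comparison, here is the same pattern in Python's `re`. One difference from PHP is worth keeping in mind: an optional group that did not participate is reported as `None` rather than as an empty string. The input strings are invented.

```python
import re

pattern = re.compile(r"(download-)?(google-chrome|mozilla-firefox)(-free)?")

# Hypothetical inputs: one bare name, one with both prefix and suffix.
bare = pattern.search("google-chrome").groups()
full = pattern.search("download-mozilla-firefox-free").groups()

print(bare)
print(full)
```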

Extracting the first decimal value between XML tags

regex,xml,perl,notepad++,pcre

Try the following PCRE regex: (?<=<value>)(\d+\.\d+)(?=\/) Debuggex Demo (with the sample input string in the question) Key points: (?<=\<value\>) – a positive lookbehind ((?<=...)) for <value> – i.e. that <value> precedes the next portion of the input string to match (\d+\.\d+) – a decimal number in the form of one...

What are zero width independent subexpressions in regexes?

regex,pcre

The (?>...) is an atomic group: An atomic group is a group that, when the regex engine exits from it, automatically throws away all backtracking positions remembered by any tokens inside the group. Atomic groups are non-capturing. -- http://www.regular-expressions.info/atomic.html And, as Tim points out, atomic groups are not zero-width. The...

Using regex to parse a delimited array in bash

regex,bash,svg,grep,pcre

A sed solution can be like $ sed -r '/points=/ s/[^,]+,?([0-9]*)/\1 /g' input 287 470 509 459 471 OR for much better handling $ sed -r '/points=/ s/.*points=("[^"]+").*/\1/g; s/[^,]+,?([0-9]*)/\1 /g' input 287 470 509 459 471 ...

Regex expression for a rule in yii model validation

php,regex,yii,pcre

Looks like your asterisk is in the wrong place /^dev_([0-9]{3})_<([0-9a-zA-Z \- _])>*$/ should be /^dev_([0-9]{3})_<([0-9a-zA-Z \- _]*)>$/ This will mean that your character set [0-9a-zA-Z \- _] should match zero or more of those characters...
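
A quick way to see the effect of moving the asterisk, sketched with Python's `re` (the character class is copied unchanged; the sample identifiers are made up):

```python
import re

# Asterisk after ">": exactly one character inside <...>, then any number of ">".
misplaced = re.compile(r"^dev_([0-9]{3})_<([0-9a-zA-Z \- _])>*$")
# Asterisk inside the group: any number of allowed characters inside <...>.
fixed = re.compile(r"^dev_([0-9]{3})_<([0-9a-zA-Z \- _]*)>$")

sample = "dev_123_<my-name 1>"  # hypothetical value

print(misplaced.match(sample))       # no match
print(fixed.match(sample).groups())  # both parts captured
```

The misplaced version only accepts strings such as `dev_123_<a>>>`, which is rarely what a validation rule intends.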

Slow Ruby Regex Becomes Fast with Odd Change

ruby,regex,pcre

The first pattern is slow because it starts with an alternation and the first branch of the alternation is very permissive since it allows any number of word characters or dots or hyphens. As a consequence, this alternation takes a lot of time/steps before failing. The second pattern is faster because (?:[^\w+.-]|^)...

Improve Regex for match valid values

javascript,regex,pcre

For a start for the first one: ^(?:(?:(?:00|\+)58|0)?(?:2(?:12|4[0-9]|5[1-9]|6[0-9]|7[0-8]|8[1-35-8]|9[1-5]|3[45789]))\d{7})|(?:\(212\)-?\d{3}\.\d{2}\.\d{2})$ RegEx101 … for the second one: ^(?:(?:(?:00|\+)58)(?:4(?:1[246]|2[46]))\d{7})|(?:0?\d{10})|(?:\(4(?:[12]4|12)\)\d{3}\.\d{2}\.\d{2})$ RegEx101 For the last one, more input is required, how to differentiate between valid and invalid values. RegEx101...

PCRE regex whitespace

regex,pcre,pcregrep

The problem here is partial matches, as you have not used an end anchor $. In the case of example 3, there will be a partial match up to the ; done by \s*. In the other regex you have disabled \s, so it will not capture the space and the partial match is disabled....

How to prepare a regex that will escape partial digits while matching?

c++,regex,pcre

Use look-behind to assert that the previous character is not a digit: (?<!\d)([0-7]\d{2})- ...
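
The look-behind works the same way in Python's `re`, so its effect can be sketched there. The input string is hypothetical.

```python
import re

# Three digits starting with 0-7, followed by "-", and not preceded by a digit.
pattern = re.compile(r"(?<!\d)([0-7]\d{2})-")

# "9123-" must not yield "123": the 1 is preceded by another digit.
text = "123-4567 9123-45 700-"
codes = pattern.findall(text)
print(codes)
```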

Limit matching results of grep regex multi-line search to one

regex,linux,command-line,grep,pcre

Here's something for sed: /<SpecificTag>/,/<\/SpecificTag>/ { /<SpecificTag>/ { s/.*<SpecificTag>// } /<\/SpecificTag>/ { s/<\/SpecificTag>.*// p q } p } Put that in a file, say foo.sed, and use sed -n -f foo.sed filename.xml. The way this works is as follows: /<SpecificTag>/,/<\/SpecificTag>/ { means that all this only happens for lines between...

Detecting a double-quote-enclosed string with double-quote and backslash escaping, in a Perl Compatible Regular Expression

regex,pcre

Here you go: "(?:\\.|[^"])*" Demo For each character in the string, match either a backslash followed by anything, or a character that is not a quote. And if you need something optimized, here's an alternative: "(?>[^\\"]++|\\.)*+" Demo It basically uses possessive quantifiers to avoid backtracking....
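
The first pattern can be tried in Python's `re` as written. The possessive variant is left out of this sketch, since possessive quantifiers and atomic groups need Python 3.11 or later. The sample text is invented.

```python
import re

# Either a backslash plus any character, or any character that is not a quote.
quoted = re.compile(r'"(?:\\.|[^"])*"')

# Hypothetical input containing escaped quotes inside a quoted string.
text = r'say "a \"quoted\" word" and "b"'
strings = quoted.findall(text)
print(strings)
```

Putting `\\.` first in the alternation matters: it makes the engine consume a backslash together with the character it escapes before `[^"]` gets a chance at it.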

Combine Multiple Regex Groups in one Regex

php,regex,pcre

Based on what I understood, you can use alternation (|) to combine the different regexes: \b ((?<CASE_UPPER>[[:upper:]]+)| (?<CASE_INITIALCAPS>[A-Z][a-z]+[A-Z]*+)| (?<CASE_MIXED>[A-z]*[A-Z][A-z]*+)| (?<PHRASE>[A-Z][\w-]*(\s+[A-Z][\w-]*)+)) \b Regex Example...

Regexp to match but don't capture middle part

sublimetext2,pcre

Assuming sublime uses capturing groups you can use: (@).*?(keyframes) regex capturing groups are indicated by parentheses. (From phone will try to clean up later)...

How to preserve the number of characters during a sed replacement

regex,replace,sed,pcre

Unfortunately, sed doesn't have a pcre regex engine. with perl To have advanced regex features you can use perl in command line: perl -pe 's/(?:\G(?!\A)|account )\K\d(?=\d{4})/x/g' <<< 'account 123499029 account 12345 account 99999200193' details: (?: # open a non-capturing group \G # position after the previous match or start of...

Regex (pcre) - Parse ini-like file, between different nested, unquoted delimiters

regex,parsing,pcre

I get the feeling I'm using regex wrong. Indeed! There's a much simpler way: you can use regex to tokenize that input, then use some code to make sense of it. Here's a pattern for you (with x): ^@\s*(?<heading>\w+)\s*?$ |^\[(?<section>\w+)\]\s*?$ |^(?<field>\w+):\s*(?:"(?<quotedstr>(?:\\.|[^\\"]++)*+)"|(?<barestr>.*?))\s*?$ Now take a look at this demo: See...

(PCRE Regex) How to match up to string (a) unless string (b) preceds it?

html,regex,pcre

I understand you want to match a tag containing the text "click here" and maybe other tags inside. You also need to avoid the situation where this is matched: <a href="#">Hi there</a> <a href="#">Hi, <b>click here</b></a> and instead match only the second <a href="#">Hi, <b>click here</b></a>. What you need is to make sure there is...

“'\w' is an unrecognized escape” in grep [duplicate]

regex,r,grep,pcre

You need to escape the backslashes one more time in R. d$SomeColumn[grep("(?ix)<VNW[^;]*;(dis|dat)> \\w*<N\\(", d$Right, perl=TRUE)] <- 1 | | ...

grep and return line continuations?

regex,grep,pcre,pcregrep

Maybe you are looking for something like this where you search in your file after replacing slashes with newlines... tr '\\' '\n' < input.txt | grep something ...

Match group again without pasting it again (eg. ABA)

regex,pcre

Basically you can refer to a subpattern (a capture group) with its number: /([0-9]{3})...((?1))/ # or /([0-9]{3})...(\g<1>)/ # oniguruma syntax or you can use a relative reference too: /([0-9]{3})...(?-1)/ # -1 means the last opened capture group on the left /([0-9]{3})...\g<-1>/ # (oniguruma) # or if there's another opened...

I have a list of 9 random numbers, how do I find duplicates with RegEx?

regex,parsing,random,numbers,pcre

You can use "back-references" to solve this... /(^|[^0-9])([0-9]+)(?=[^0-9]).*[^0-9]\2([^0-9]|$)/ The meaning is (^|[^0-9]) Start of the string OR a non-numeric ([0-9]+) one or more digits (?=[^0-9]) a non-numeric char, but don't take it .* anything [^0-9] a non-numeric char \2 the same thing we found at 2 (it's the second parenthesized expression)...
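
As a rough check, the pattern runs unchanged in Python's `re`. The two number lists below are invented for the illustration.

```python
import re

# Group 2 is a whole number; \2 demands the same number again later,
# delimited by non-digits (or the end of the string) on both sides.
dup = re.compile(r"(^|[^0-9])([0-9]+)(?=[^0-9]).*[^0-9]\2([^0-9]|$)")

with_dup = dup.search("12 7 33 7 41")     # 7 appears twice
without_dup = dup.search("12 7 33 8 41")  # all numbers differ

print(with_dup.group(2))
print(without_dup)
```

The delimiters around `\2` are what stop the `1` in `12` from being reported as a duplicate of the `1` in `41`.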

how to implement template with conditions with preg_replace

php,regex,preg-replace,preg-match,pcre

$available_functions = array('ucfirst', 'strtolower', 'strtoupper', 'lcfirst'); $registry = array( 'profession_name' => 'actor', 'total_found' => 100, ); $template = 'Profession {"is":strtoupper} {profession_name:strtoupper:lcfirst}. [{total_found}, {total_found} vacancies found, No vacancies found]. {profession_name} in your town'; $pattern = <<<'EOD' ~ (?=[[{]) # speed up the pattern by quickly skipping all characters that are not...

Match a given sequence only if not between quotes, taking escaped quotes into consideration

php,regex,pcre

With PCRE you could skip the quoted stuff: (?s)".*?(?<!\\)"(*SKIP)(*F)|\bsection\b In a string pattern you have to triple-escape the backslash, like \\\\, to match a literal backslash in the lookbehind. Or, in a single-quoted pattern, double escaping it would be sufficient for this case. $pattern = '/".*?(?<!\\\)"(*SKIP)(*F)|\bsection\b/s'; See test at regex101....

RegEx to validate a comma separated list of options

javascript,php,regex,pcre

Sometimes regular expressions just make things more complicated than they should be. Regular expressions are really good at matching patterns, but when you introduce external rules that have dependencies on the number of matched patterns things get complicated fast. In this case I would just split the list on comma...

Regex to pull 2nd-level domain out of text

regex,pcre

Change your regex to this: (?i)https?:\/\/(?:[^. ]+\.)*(?P<test>[\w-]+\.[\w-]+) RegEx Demo...
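
Python's `re` accepts the same `(?P<test>...)` named-group syntax, so the pattern can be sketched there. The URLs are made-up examples, and the escaped slashes are written as plain `/` since Python needs no delimiter escaping.

```python
import re

# Skip any number of leading "label." parts, then capture the last two labels.
pattern = re.compile(r"(?i)https?://(?:[^. ]+\.)*(?P<test>[\w-]+\.[\w-]+)")

deep = pattern.search("see https://www.mail.example.com/path")
flat = pattern.search("http://example.org")

print(deep.group("test"))
print(flat.group("test"))
```

The greedy `(?:[^. ]+\.)*` first takes every label it can and then gives one back, so that two labels remain for the named group.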

regex to identify string and numbers from data

regex,pcre

For all dollar matches. ^\w+\s+(\w+)|(\d+\$) with the global tag. DEMO The idea is that you want two separate matches, as when you use line-related characters ^ or $, or even * and + in some cases, you are likely to get only first or final match. After you get your...

Validation rule using regex

regex,pcre

It would be nice if something like /(.*[^\/])\/*(\?.*)?/ worked. But the problem is that the regex engine will find the best possible match for (.*[^\/])\/*, even if this means matching (\?.*)? against the empty string.* You could do the following: /(.*[^\/])\/*(\?.*)|(.*[^\/])/ This is slightly unsatisfactory in that you get 3...

PCRE regex replace a text pattern within double quotes

regex,notepad++,pcre

If your data have variable number of occurrences of quoted strings, then it is not possible to perform replacements only via regex at least in its form offered by Notepad++. To replace using regex, you would need to perform regex find in existing regex match. As far as I know...

If-else in recursive regex not working as expected

php,regex,pcre

The simplest solution is to use alternatives instead: <(?:a|"a")> ([^<]++ | (?R))* </a> But if you really don't want to repeat that a part, you can do the following: <("?)a\1> ([^<]++ | (?R))* </a> Demo I've just put the conditional ? inside the group. This time, the capturing group always...

Escaping regular expressions in PHP

php,regex,escaping,pcre

The pound symbol was the issue here, replacing it to an exclamation mark solved the problem. Working expression: $pattern = '!((http|ftp|https):\/\/)?([\w\-_]+(?:(?:\.[\w\-_]+)+))([\w\-\.,@?^=%&:/~\+#]*[\w\-\@?^=%&/~\+#])?!'; For some reason this is working fine with no escape functions....

How to perl regex match in R in the grepl function?

r,function,pattern-matching,pcre,grepl

You just need to properly escape the backslash in your regex ff<-function(x) grepl('\\bx\\b',x, perl=T) ff(c("axa","a x a", "xa", "ax","x")) # [1] FALSE TRUE FALSE FALSE TRUE ...

How to match a string before something or the end of line?

regex,pcre

Make the .+ lazy by adding a ? and add the alternation to $ (?<MyGroup> 4\/(.+?)(?=(?: \d\/)|$)) (.+?) The lazy matching causes the regex engine to stop once it sees the first \d/ rather than continuing to the end of the string. Regex Demo...

How to match a group of value to group 1

regex,pcre

The issue with your regex is that group 1 and group 2 are enclosed within a non-capturing group. This caused the entire regex to get captured with group 0. The other thing is that the positive lookahead prevented the regex from doing a global match. The regex below will gather all...

PCRE_UTF8 Modifier Extremely Slow

php,regex,utf-8,pcre

PCRE checks for UTF validity before any other processing takes place. From the PCRE docs: When the PCRE2_UTF option is set, the strings passed as patterns and subjects are (by default) checked for validity on entry to the relevant functions. If an invalid UTF string is passed, a negative error...

Match or remove string that occurs multiple times within two strings with regex

regex,pcre

You can do that: $pattern = '~(?:"NULL","0","0","0",",","1",",","|(?!^)\G)[^"]+\K","(?!D\$)~'; $csv = preg_replace($pattern, ',', $csv); pattern details: ~ # delimiter (?: "NULL","0","0","0",",","1",","," | (?!^)\G # anchor for the end of the last match ) [^"]+ # content between quotes \K # removes all on the left from match result "," # "," (?!D\$)...

php preg_match_all conditional pattern

php,regex,matching,pcre

I would use @Casimir's answer for this purpose.. If you are looking for a regex.. use the below pattern: <script[^>]*id="(?!pagespeed\b)[^"]+".*<\/script> See DEMO...

Use Regex to match a set of numbers with variable positioning?

regex,pcre

As far as I understand, your input is: p1 v1 p2 v2 p3 v3 p4 v4, where p is the position and v is the value of the digit. If that is the case, arrange them in increasing order of position, so I would assume p1 < p2...

Matching multiline string at command line: return certain line if pattern matches, otherwise return empty string

regex,unix,command-line,pcre,pcregrep

I thought the following would work, but I could not get a look-behind expression to work if it contained a newline. mycmd | pcregrep -M '(?<=^/ > -{7}\n).*\n(?=/ > $)' But the following two stage solution worked for me: mycmd | pcregrep -M '^/ > -{7}\n.*\n/ > $' | pcregrep...

Matching a^n b^n c^n for n > 0 with PCRE

regex,pcre

Qtax trick (The mighty self-referencing capturing group) ^(?:a(?=a*(\1?+b)b*(\2?+c)))+\1\2$ This solution is also named "The Qtax trick" because it uses the same technique as from "vertical" regex matching in an ASCII "image" by Qtax. The problem in question boils down to a need to assert that three groups are matched of...

How do I replace a comma separated list of strings with the first occurrence?

php,regex,pcre

Replace should start at COALESCE but the previous part must match. Use \K for resetting after (where replacing should start). So the first part could look like MATCH\s*\(\K here starts replacing and pattern continues: \s*COALESCE\s*\(\s*([^,)]+)... capturing in first capture group. And the optional part (?:,[^)]*)? to meet a closing bracket....

Regular expression optimization

python,regex,optimization,pcre

The first step is to get rid of the unneeded reluctant (a.k.a. "lazy") quantifiers. According to RegexBuddy, your regex: ^(.+?)\|[\w\d]+?\s+?(\d\d\/\d\d\/\d\d\d\d\s+?\d\d:\d\d:\d\d\.\d\d\d)[\s\d]+?\s+?(\d+?)\s+?\d+?\s+?(\d+?)$ ...takes 6425 steps to match your sample string. This one: ^(.+?)\|[\w\d]+\s+(\d\d\/\d\d\/\d\d\d\d\s+\d\d:\d\d:\d\d\.\d\d\d)[\s\d]+\s+(\d+)\s+\d+\s+(\d+)$ ...takes 716 steps. Reluctant quantifiers reduce backtracking by doing more work up front. Your regex wasn't prone to excessive...

Can conditionals be used to pair balance group elements?

regex,recursion,pcre

I would say that there is no use polluting the logic with a bunch of named groups and unnecessary recursion that is unpredictable. There are three main parts to maintaining a proper recursion (listed below). To do it right, you must parse everything, so you have to take into account...

parsing html using pcre in python re module [closed]

python,pcre

temp=re.finditer(r"<p>[^\"\&\;]*?<\/p>\s*<ul>\s*<li>\d(.|\s)*?<\/ul>",html) for match in temp: print match.group(0) ...

Non-greedy nginx location regex returning a 404 error

regex,nginx,pcre

A single ~ was the source of all the pain: In order to use regular expressions, you must always use a prefix: "~" must be used for case sensitive matching "~*" must be used for case insensitive matching So: location ~ /api/v2/([^/]*)/region { proxy_pass http://SOME-URL/api/v2/$1/region; } Did the trick -...

Regular Expressions to match PHP variable and string concatenation

php,regex,string,variables,pcre

Next to the two given suggestions, if you're looking for PHP PCRE based regexes to validate a subset of PHP, this can be done more structured by specifying named subpatterns for the tokens you're looking for. Here is an exemplary regular expression pattern that's looking for these patterns even allowing...

Escaping a square bracket in a MongoDB regex / PCRE

regex,mongodb,pcre

Use a backslash \ in front of the square bracket, as below: db.collection.find({"x":{"$regex":"\\[text"}}) db.collection.find({"x":{"$regex":"^\\[text"}}) Or db.collection.find({"x":{"$regex":"\\\\[text"}}) db.collection.find({"x":{"$regex":"^\\\\[text"}}) This returns the documents that start with [text. For example, if the documents contain the following data { "_id" : ObjectId("55644128dd771680e5e5f094"), "x" : "[text" } { "_id" : ObjectId("556448d1dd771680e5e5f099"), "x" : "[text sd asd "...

Regex excluding string

php,regex,pcre

You can insert this negative lookahead at start of your regex: (?!.*?\.onion) i.e. ~(?i)\b(?!.*?\.onion)((?:[a-z][\w-]+:(?:/{1,3}|[a-z0-9%])|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,4}/)(?:[^\s()<>]+|\(([^\s()<>]+|(\([^\s()<>]+\)))*\))+(?:\(([^\s()<>]+|(\([^\s()<>]+\)))*\)|[^\s`!()\[\]{};:\'".,<>?«»“”‘’]))~ ...

Getting all possible matches from a regex capture group

php,regex,pcre

My sincere advice to you is not to handle that (HTML) using regular expressions; simply use a DOM Parser instead. The code.. <?php $string = '<form action="post.php" method="post" name="form1">'; $dom = new DOMDocument; $dom->loadHTML($string); foreach ($dom->getElementsByTagName('form') as $ftag) { if ($ftag->hasAttributes()) { foreach ($ftag->attributes as $attribute) { $attrib[$attribute->nodeName] =...

PCRE and PHP - Escaping meta-character

php,preg-match,pcre

Single quote strings are not processed (not to the extent of double quote strings, to be precise) and are taken "as-is", but when a string is specified in double quotes, more special characters are available and variables are parsed within it. If a dollar sign ($) is encountered in a...

Regex for phone number with specific area code and format

php,regex,pcre

Actually this is somewhat hard-coded because you are having some specific requirements like validating 868-000-0000 but not 868-0000000. Regex is: (^\(868\)\ \d{3}\ \d{4})|(^868\d{7})|(^868\-\d{3}\-\d{4})|(^868\ \d{3}\-\d{4}) DEMO...

Converting PCRE to POSIX regular expression

mysql,regex,pcre,posix-ere

The MySQL docs state that: MySQL uses Henry Spencer's implementation of regular expressions, which is aimed at conformance with POSIX 1003.2. MySQL uses the extended version to support pattern-matching operations performed with the REGEXP operator in SQL statements. Ok, so we're talking about POSIX ERE. This page lists the details...

php preg_split a text without losing ,.: and so forth

php,regex,pcre,preg-split

You can use this lookahead based regex: $str = 'I search 1, regex to: no. Or... yes!'; $tok = preg_split('/\h+|(?<!\W)(?=\W)/', $str); print_r($tok); Array ( [0] => I [1] => search [2] => 1 [3] => , [4] => regex [5] => to [6] => : [7] => no [8] =>...
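
A comparable split can be sketched in Python, under two assumptions: `\h` does not exist in Python's `re`, so `[ \t]+` stands in for it, and splitting on zero-width matches requires Python 3.7 or later.

```python
import re

# Split on runs of horizontal whitespace, or at the empty position between
# a word character (or the string start) and a following non-word character.
splitter = re.compile(r"[ \t]+|(?<!\W)(?=\W)")

text = "I search 1, regex to: no. Or... yes!"
tokens = splitter.split(text)
print(tokens)
```

Because the second alternative consumes nothing, the punctuation is kept as separate tokens rather than being dropped.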

What is the correct syntax for a Regex find-and-replace using REGEXP_REPLACE in MariaDB?

mysql,regex,pcre,mariadb

You have to do a lot of escaping here: REGEXP_REPLACE(message, "\\[quote\\sauthor=(.+)\\slink=[^\\]]+]", "\\[quote=\"\\1\"\\]") Please note that you have to reference the Group by \\1...

How to exclude part of alternative from capture?

python,regex,python-2.7,pcre,regular-language

If you don't mind multiple capture groups (and therefore slightly altering the rest of the code), it's super easy - just do the opposite of what you're doing. (?:(description|speed|type|peers)\s+set|(classify)) as seen in https://regex101.com/r/bR1nV7/1 If you don't want it, you can use lookarounds. ((?:description|speed|type|peers)(?=\s+set)|classify) as seen in https://regex101.com/r/bR1nV7/2 There is no...

Matching both greedy, nongreedy and all others in between [duplicate]

regex,pcre

Felipe is not looking for /foo/bar/baz, /bar/baz, /baz but for /foo, /foo/bar, /foo/bar/baz. One solution, building on the regex idea in the comments but giving the right strings: reverse the string to be matched: xuuq/zab/rab/oof/ For instance in PHP use strrev($string ) match with (?=((?<=/)(?:\w+/)+)) This gives you zab/rab/oof/ rab/oof/ oof/ Then reverse...