more efficient regex to parse linux top command values

Tag: regex,linux,perl,optimization

I'm trying to grab some per-process measurements in a script I'm writing. The easiest way to see the values I'm looking for is to grab the output of the top command.

When I try to parse it, though, my regex looks kind of ridiculous. Given this output:

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 8364 cgroup_t  20   0  764m 646m 1520 R 101.7  4.3   0:05.51 perl

I came up with this regex to grab some values (the 8364 is passed in via a variable and hard-coded here for ease of reading; the top output is stored in a variable called $top_string):

if($top_string =~ m/^\s*8364\s+([^\s]+)\s+([^\s]+)\s+([^\s]+)\s+([^\s]+)\s+([^\s]+)\s+([^\s]+)\s+([^\s]+)\s+([^\s]+)\s+([^\s]+)\s+([^\s]+)\s+([^\s]+)/){
    #return desired var number, ie.  $1,$2...etc
}

This works but it seems like overkill. Is there any way to do this more efficiently? I feel like maybe I remember a way to avoid typing the \s+([^\s]+) pattern over and over.

Anyway thanks for taking the time to read this!

Cheers

Best How To :

Use split when you have a delimiter:

my @cols = split ' ', ( $top_string =~ /(\d.+)/ )[0];
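
As a rough illustration of the same split-on-whitespace idea, here is a sketch in Python rather than Perl. The helper name columns_for_pid is hypothetical, and the sample text is just the header and row quoted in the question.

```python
# Sample text: the header and process row quoted in the question.
top_string = (
    "  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND\n"
    " 8364 cgroup_t  20   0  764m 646m 1520 R 101.7  4.3   0:05.51 perl\n"
)


def columns_for_pid(top_output, pid):
    """Return the whitespace-separated fields of the row whose first field is pid."""
    for line in top_output.splitlines():
        fields = line.split()  # no argument: split on runs of whitespace
        if fields and fields[0] == str(pid):
            return fields
    return None  # hypothetical convention: None when the PID is not listed


cols = columns_for_pid(top_string, 8364)
print(cols[8], cols[9])  # the %CPU and %MEM columns
```

Indexing the resulting list replaces the twelve numbered capture groups, and the PID check replaces the hard-coded 8364 in the pattern.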

Please can someone help me understand the exec method for regular expressions?

javascript,regex

I don't understand why it would give me two hellos back? Because the first entry in the array is the overall match for the expression, which is then followed by the content of any capture groups the expression defines. Since the expression defines one capture group, you get back...

ret_from_syscall source code and when it is called

linux,linux-kernel,kernel,linux-device-driver,system-calls

The ret_from_syscall symbol will be in architecture-specific assembly code (it does not exist for all architectures). I would look in arch/XXX/kernel/entry.S. It's not actually a function. It is part of the assembly code that handles the transition from user-space into kernel-space for a system call. It's simply a label to...

jQuery / Regex: How to compare string against several substrings

jquery,regex,string,substring,substr

You could convert this to a slightly more maintainable format, without getting into regular expressions. This is one way to use an array to accomplish your goal: // Super-quick one-liner: var str = '2042038423408'; var matchCount = $.grep(['12', '23', '34', '45', '56', '67', '78', '89', '90', '01'], function(num, i) {...

Reg ex matching a word

regex

You could use a negative lookahead which will exclude those having _FX following the initial alpha string ^ABD_DEF_GHIJ(?!_FX)(?:_\d{8})?$ see example here...
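
A quick way to check the lookahead's behaviour, sketched in Python. The pattern is the one from the answer; the sample names are made up for illustration.

```python
import re

# Pattern taken from the answer; the sample names below are hypothetical.
PATTERN = re.compile(r"^ABD_DEF_GHIJ(?!_FX)(?:_\d{8})?$")

samples = [
    "ABD_DEF_GHIJ",
    "ABD_DEF_GHIJ_20150101",
    "ABD_DEF_GHIJ_FX",
    "ABD_DEF_GHIJ_FX_20150101",
]

# Keep only the names that the pattern accepts.
accepted = [s for s in samples if PATTERN.match(s)]
print(accepted)
```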

Get number from string

regex

Use \d+ to match one or more digits. \b(?:http:\/\/)?(?:www\.)?example\.com\/g\/(\d+)\/\w Put http:// and www. inside a capturing or non-capturing group and then make it optional by adding the ? quantifier after that group. For both http and https, it would be (?:https?:\/\/)? DEMO...
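
A minimal Python sketch of the optional-group idea, using the https? variant. The helper name extract_id and the sample URLs are assumptions, not part of the original answer.

```python
import re

# Scheme and "www." each sit in an optional non-capturing group;
# the only capturing group is the run of digits.
URL_ID = re.compile(r"\b(?:https?://)?(?:www\.)?example\.com/g/(\d+)/\w")


def extract_id(url):
    """Return the digits after /g/ as a string, or None if the URL does not match."""
    m = URL_ID.search(url)
    return m.group(1) if m else None


print(extract_id("http://www.example.com/g/12345/abc"))
print(extract_id("example.com/g/7/x"))
```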

How can I resolve the “Could not fix timestamps in …” “…Error: The requested feature is not implemented.”

linux,build,f#

This is usually a sign that you should update your mono. Older mono versions have issues with their unzip implementation

Finding embedded xpaths in a String

java,regex

Use {} instead of () because {} are not used in XPath expressions and therefore you will not have confusions.

How do I isolate the text between 2 delimiters on the left and 7 delimiters on the right in Python?

python,regex,string,split

You can use python's built-in csv module to do this. j = next(csv.reader([string])); Now j is each item delimited by a , and will include commas if the value is wrapped in ". See j[2]....

Ignore first few lines and last few lines in a file Linux

linux,awk

awk cannot look ahead so you'll have to save the lines. awk 'NR>2{if(z!="")print z;z=y;y=x;x=$0}' file Practically zero memory overhead...
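
The same "save the lines" idea can be sketched in Python with a small buffer. The function name trim_lines and its defaults are hypothetical; the defaults mirror the awk one-liner as I read it (skip the first 2 lines, hold back the last 3).

```python
from collections import deque


def trim_lines(lines, skip_head=2, skip_tail=3):
    """Yield lines, dropping the first skip_head and the last skip_tail."""
    buf = deque()
    for number, line in enumerate(lines, 1):
        if number <= skip_head:
            continue  # still inside the header lines
        buf.append(line)
        if len(buf) > skip_tail:
            # The oldest buffered line can no longer be one of the last skip_tail.
            yield buf.popleft()


print(list(trim_lines(["1", "2", "3", "4", "5", "6", "7", "8"])))
```

The buffer never holds more than skip_tail + 1 lines, so memory use stays flat regardless of file size.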

Regex that allow void fractional part of number

c#,regex

Just move the dot outside of the capturing group and then make it optional. @"[+-]?\d+\.?\d*" Use anchors if necessary. @"^[+-]?\d+\.?\d*$" ...
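
For illustration, the anchored pattern behaves like this in Python (the regex syntax used here is the same in both engines). The helper name is_number and the test strings are assumptions.

```python
import re

# Anchored form of the pattern from the answer: optional sign, digits,
# optional dot, optional fractional digits.
NUMBER = re.compile(r"^[+-]?\d+\.?\d*$")


def is_number(text):
    """True if text is an integer or decimal, with the fractional part allowed to be empty."""
    return NUMBER.match(text) is not None


for candidate in ["12", "12.", "12.5", "-3.14", ".5", "1.2.3"]:
    print(candidate, is_number(candidate))
```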

Django MySQLClient pip compile failure on Linux

python,linux,django,gcc,pip

It looks like you're missing zlib; you'll want to install it: apt-get install zlib1g-dev I also suggest reading over the README and confirming you have all other dependencies met: https://github.com/dccmx/mysqldb/blob/master/README Also, I suggest using mysqlclient over MySQLdb as it's a fork of MySQLdb and what Django recommends....

Get all prices with $ from string into an array in Javascript

javascript,regex,currency

It’s quite trivial: RegEx string.match(/\$((?:\d|\,)*\.?\d+)/g) || [] That || [] is for no matches: it gives an empty array rather than null. Matches $99 $.99 $9.99 $9,999 $9,999.99 Explanation / # Start RegEx \$ # $ (dollar sign) ( # Capturing group (this is what you’re looking for) (?: #...
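
A Python sketch of the same extraction, for comparison. The function name find_prices is hypothetical. Python's finditer yields no matches rather than null, so the || [] fallback is not needed here.

```python
import re

# Same pattern as the JavaScript answer.
PRICE = re.compile(r"\$((?:\d|,)*\.?\d+)")


def find_prices(text):
    """Return every price in text, dollar sign included, as a list of strings."""
    return [m.group(0) for m in PRICE.finditer(text)]


print(find_prices("$99 $.99 $9.99 $9,999 $9,999.99"))
print(find_prices("no prices here"))
```

Using m.group(1) instead of m.group(0) would return the amounts without the dollar sign.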

Using an ad-hoc libc with a tool which is an argument of another tool

linux,shared-libraries

You can achieve that by using the env utility: timeout 10 /usr/bin/env LD_LIBRARY_PATH=/path/to/mod/libc/ cp a b Env will set the environment variable and exec the other utility with that environment....

Bash modify CSV to change a field

linux,bash,awk

Please save following awk script as awk.src: function date_str(val) { Y = substr(val,0,4); M = substr(val,5,2); D = substr(val,7,2); date = sprintf("%s-%s-%s",Y,M,D); return date; } function time_str(val) { h = substr(val,9,2); m = substr(val,11,2); s = substr(val,13,2); time = sprintf("%s:%s:%s",h,m,s); return time; } BEGIN { FS="|" } # ## MAIN...

Python regular expression, matching the last word

python,regex,list

Use the alternation with $: import re mystr = 'HelloWorldToYou' pat = re.compile(r'([A-Z][a-z]*)') # or your version with `.*?`: pat = re.compile(r'([A-Z].*?)(?=[A-Z]+|$)') print pat.findall(mystr) See IDEONE demo Output: ['Hello', 'World', 'To', 'You'] Regex explanation: ([A-Z][a-z]*) - A capturing group that matches [A-Z] a capital English letter followed by [a-z]* -...

How to match words in 2 list against another string of words without sub-string matching in Python?

python,regex,string,loops,twitter

Store slangNames and riskNames as sets, split the strings and check if any of the words appear in both sets slangNames = set(["Vikes", "Demmies", "D", "MS", "Contin"]) riskNames = set(["enough", "pop", "final", "stress", "trade"]) d = {1: "Vikes is not enough for me", 2:"Demmies is okay", 3:"pop a D"} for...

javascript replace dot (not period) character

javascript,regex,replace

Try using the unicode character code, \u2022, instead: message.replace(/\u2022/, "<br />\u2022"); ...

Python match whole file name, not just extension

python,regex,nsregularexpression

You're not capturing the whole filename in the group. You can also use noncapturing groups with (?:...). .*\.(rom|[0-9]{3})+ # from this (.*\.(?:rom|[0-9]{3})) # to this ...
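
To see the difference between the two patterns, a small Python sketch (the file names are made up):

```python
import re

# "before" captures only the extension; "after" wraps the whole name in the
# capturing group and makes the alternation non-capturing.
before = re.compile(r".*\.(rom|[0-9]{3})+")
after = re.compile(r"(.*\.(?:rom|[0-9]{3}))")

name = "game.rom"  # hypothetical file name
ext_only = before.match(name).group(1)
whole_name = after.match(name).group(1)
print(ext_only, whole_name)
```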

How many characters are visible like a space, but are not space characters?

php,regex

You can make use of a Unicode category \p{Zs}: Zs    Space separator $string = preg_replace('~\p{Zs}~u', ' ', $string); The \p{Zs} Unicode category class will match these space-like symbols: Character Name U+0020 SPACE U+00A0 NO-BREAK SPACE U+1680 OGHAM SPACE MARK U+2000 EN QUAD U+2001 EM QUAD U+2002 EN SPACE U+2003 EM SPACE...

Warning: preg_match_all(): Unknown modifier '\' [duplicate]

php,regex,warnings

Use a different set of delimiters for the regex. For example, you can write preg_match_all('~[^/\s]+/\S+\.(jpg|png|gif)~', $string, $results ...

Java - Enforce TextField Format - UX - 00:00:00;00

java,regex,user-interface

How about using JFormattedTextField with MaskFormatter. JFormattedTextField formattedTextField = new JFormattedTextField("00:00:00;00"); try { MaskFormatter maskFormatter = new MaskFormatter("##:##:##;##"); maskFormatter.install(formattedTextField); } catch (ParseException e) { e.printStackTrace(); } More info at http://docs.oracle.com/javase/tutorial/uiswing/components/formattedtextfield.html Demo code: JFrame frame = new JFrame(""); frame.setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE); JPanel panel = new JPanel(); JFormattedTextField...

Extracting strings from HTML with Python wont work with regex or BeautifulSoup

python,regex,parsing,beautifulsoup,python-requests

In order to match the string with a literal backslash, you need to double-escape it in a raw string, e.g.: re.search(r'@CAD_DTA\\">(.+?)@[email protected]@CAD_LBL',result.text) ^ ^ In order to get the index of the found match, you can use start([group]) of re.MatchObject IDEONE demo: import re obj = re.search(r'@CAD_DTA\\">(.+?)@[email protected]@CAD_LBL', 'Some text [email protected]_DTA\\">I WANT...

BeautifulSoup: Parsing bad Wordpress HTML

python,html,regex,wordpress,beautifulsoup

At least, you can rely on the tag names and text, navigating the DOM tree horizontally - going sideways. These are all strong, p and span (with id attribute set) tags you are showing. For example, you can get the strong text and get the following sibling: >>> from bs4...

PHP Regular Expressions Counting starting consonants in a string

php,regex

This is one way to do it, using preg_match: $string ="SomeStringExample"; preg_match('/^[b-df-hj-np-tv-z]*/i', $string, $matches); $count = strlen($matches[0]); The regular expression matches zero or more (*) case-insensitive (/i) consonants [b-df-hj-np-tv-z] at the beginning (^) of the string and stores the matched content in the $matches array. Then it's just a matter...
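
The same counting approach, sketched in Python for comparison. The function name leading_consonants is an assumption.

```python
import re

# Same character class as the PHP answer; re.IGNORECASE plays the role of /i.
LEADING = re.compile(r"^[b-df-hj-np-tv-z]*", re.IGNORECASE)


def leading_consonants(text):
    """Count the consonants at the start of text, stopping at the first other character."""
    return len(LEADING.match(text).group(0))


print(leading_consonants("SomeStringExample"))
print(leading_consonants("Strength"))
```

Because the quantifier is *, the pattern always matches (possibly an empty string), so match() never returns None here.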

match line break except line begin with specific word or blank line

regex,notepad++

Try this regex: (?<=[a-zA-Z])(\n) I used parentheses to capture the newline character. https://regex101.com/r/zS9pB4/3...

Regex pass dynamic values with boundary

c#,regex,string,boundary

Your first regular expression has a backslash followed by the letter b because of that @. The second one has the character that represents backspace. Just put an @ in front: string bound = @"\b"; ...
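
Python makes the same distinction between ordinary and raw string literals, so the pitfall can be sketched there (this is an analogy, not the C# code from the question; whole_word is a hypothetical helper):

```python
import re

plain = "\b"   # one character: backspace (0x08)
raw = r"\b"    # two characters: backslash + b, the regex word boundary


def whole_word(word):
    """Build a regex that matches word only when it stands alone as a word."""
    return re.compile(raw + re.escape(word) + raw)


print(len(plain), len(raw))
print(bool(whole_word("cat").search("a cat sat")))
print(bool(whole_word("cat").search("concatenate")))
```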

What does it indicate if /proc/PID/maps shows zero for all addresses?

linux,linux-kernel

I found a discussion on the Valgrind mailing list where someone had the same problem. The issue was that the kernel had been patched with PaX patches, one of which doesn't allow looking at /proc/pid/maps. The quote about the patch from Wikipedia: The second and third classes of attacks...

AWK write to new column base on if else of other column

linux,bash,shell,awk,sed

You can use: awk -F, 'NR>1 {$0 = $0 FS (($4 >= 0.7) ? 1 : 0)} 1' test_file.csv ...

NASM: copying a pointer from a register to a buffer in .data

linux,assembly,nasm,x86-64

The problem is, you don't have debug info for the ptr type, so gdb treats it as integer. You can examine its real contents using: (gdb) x/a &ptr 0x600124 <ptr>: 0x7fffffffe950 (gdb) p/a $rsp $3 = 0x7fffffffe950 Of course I have a different value for rsp than you, but you...

Match a pattern preceded by a specific pattern without using a lookbehind

regex,eclipse,lookahead

A work-around for the lack of variable-length lookbehind is available in situations when your strings have a relatively small fixed upper limit on their length. For example, if you know that strings are at most 100 characters long, you could use {0,100} in place of * or {1,100} in place...

Delete some lines from text using Linux command

linux,shell,sed,grep,pattern-matching

The -v option to grep inverts the search, reporting only the lines that don't match the pattern. Since you know how to use grep to find the lines to be deleted, using grep -v and the same pattern will give you all the lines to be kept. You can write...

While loop in bash using variable from txt file

linux,bash,rhel

As indicated in the comments, you need to provide "something" to your while loop. The while construct is written in a way that will execute with a condition; if a file is given, it will proceed until the read exhausts. #!/bin/bash file=Sheetone.txt while IFS= read -r line do echo sh...

Regular Expression for whole world

regex,c#-4.0,vb6

You can use: Public\s+Const\s+g(?<Name>[a-zA-Z][a-zA-Z0-9]*)\s+=\s+(?<Value>False|True) demo ...

Does there exist an algorithm for iterating through all strings that conform to a particular regex?

c#,regex,algorithm

Let's say the domain is as follows: String domain[] = { a, b, .., z, A, B, .. Z, 0, 1, 2, .. 9 }; Let's say the password size is 8: ArrayList allCombinations = getAllPossibleStrings2(domain,8); This is going to generate SIZE(domain) ^ LENGTH combinations (62^8 here), which is in...

MySQL substring match using regular expression; substring contain 'man' not 'woman'

mysql,regex

A variant of n-dru pattern since you don't need to describe all the string: SELECT '#hellowomanclothing' REGEXP '(^#.|[^o]|[^w]o)man'; Note: if a tag contains 'man' and 'woman' this pattern will return 1. If you don't want that Gordon Linoff solution is what you are looking for....

How to Match a string with the format: “20959WC-01” in php?

php,regex

$pattern = '! ^ # start of string \d{5} # five digits [[:alpha:]]{2} # followed by two letters - # followed by a dash \d{2} # followed by two digits $ # end of string !x'; $matches = preg_match($pattern, $input); ...
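
A comparable commented pattern in Python, as a sketch. Python's re module does not support POSIX classes such as [[:alpha:]], so [A-Za-z] stands in for it here (an assumption that only ASCII letters matter); re.VERBOSE plays the role of PHP's x flag.

```python
import re

PATTERN = re.compile(
    r"""
    ^            # start of string
    \d{5}        # five digits
    [A-Za-z]{2}  # two letters (stand-in for [[:alpha:]])
    -            # a dash
    \d{2}        # two digits
    $            # end of string
    """,
    re.VERBOSE,
)


def is_valid_code(text):
    """True if text has the shape of the sample code 20959WC-01."""
    return PATTERN.match(text) is not None


print(is_valid_code("20959WC-01"))
print(is_valid_code("2095WC-01"))
```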

Syncing Vagrant VMs across different physical servers

linux,vagrant,backup,virtual-machine,sync

Vagrant doesn't inherently support this, since its intended audience is really development environments. It seems like you're looking for something more like what VMware vSphere does.

REGEX python find previous string

python,regex,string

Updated: This will check for the existence of a sentence followed by special characters. It returns false if there are no special characters, and your original sentence is in capture group 1. Updated Regex101 Example r"(.*[\w])([^\w]+)" Alternatively (without a second capture group): Regex101 Example - no second capture group r"(.*[\w])(?:[^\w]+)"...

Validate part of mail suffix

c#,regex

You can use this regex to test. It will ensure that after the @ there is .xx. but may also match the string @.xx.* .*@[^.]*[.]xx[.] Or this one to ensure that there is at least one character before and after the @. .+@[^.]+[.]xx[.] ...

Swing regular expression for phone number validation

java,regex

To only allow digits, comma and spaces, you need to remove (, ) and -. Here is a way to do it with Matcher.find(): Pattern pattern = Pattern.compile("^[0-9, ]+$"); ... if (!m.find()) { evt.consume(); } And to allow an empty string, replace + with *: Pattern pattern = Pattern.compile("^[0-9, ]*$");...

How to create the javascript regular expression for number with some special symbols

javascript,regex

This matches all given examples as well: ^\$?\d+(?:[.,:]\d+)?%?$ See it in action: RegEx101 Please comment, if adjustment / further detail is required....
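
The question's examples are not reproduced in this excerpt, so the inputs below are hypothetical. This is only a Python sketch of how the pattern behaves.

```python
import re

# Optional leading $, digits, optional separator plus digits, optional trailing %.
PATTERN = re.compile(r"^\$?\d+(?:[.,:]\d+)?%?$")


def is_valid(text):
    """True if text is a number with the optional symbols the pattern allows."""
    return PATTERN.match(text) is not None


for candidate in ["$12.50", "75%", "10:30", "1,5", "12.5.3", "abc"]:
    print(candidate, is_valid(candidate))
```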

What are correct permissions for Linux Apache2 PHP 5.3 log file?

php,linux,apache,logging,permissions

I'd simply set its owner to apache user. This will give you the name of apache user : ps aux | grep httpd In my case (CentOS), it's 'apache' but sometimes it's 'www-data'... chown apache:apache /var/log/httpd/php_errors.log chmod 600 /var/log/httpd/php_errors.log ...

Regex not working in HTML5 pattern

regex,html5

The pattern attribute has to match the entire string. Assertions check for a match, but do not count towards the total match length. Changing the second assertion to \w+ will make the pattern match the entire string. You can also skip the implied ^, leaving you with just: <input pattern="(?!34)\w+"...
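
The whole-string behaviour of the pattern attribute can be approximated in Python with re.fullmatch. This is a sketch; the helper name accepts and the sample inputs are assumptions.

```python
import re


def accepts(value):
    """Mimic an HTML5 pattern check: the regex must match the entire value."""
    return re.fullmatch(r"(?!34)\w+", value) is not None


for candidate in ["12abc", "34abc", "3", "ab cd"]:
    print(candidate, accepts(candidate))
```

With re.fullmatch no explicit ^ or $ is needed, which parallels the implied anchoring described above.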

Identify that a string could be a datetime object

python,regex,algorithm,python-2.7,datetime

What about fuzzyparsers: Sample inputs: jan 12, 2003 jan 5 2004-3-5 +34 -- 34 days in the future (relative to todays date) -4 -- 4 days in the past (relative to todays date) Example usage: >>> from fuzzyparsers import parse_date >>> parse_date('jun 17 2010') # my youngest son's birthday datetime.date(2010,...

Regex to remove `.` from a sub-string enclosed in square brackets

c#,.net,regex,string,replace

To remove all the dots present inside the square brackets. Regex.Replace(str, @"\.(?=[^\[\]]*\])", ""); DEMO To remove dot or ?. Regex.Replace(str, @"[.?](?=[^\[\]]*\])", ""); ...
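
The same lookahead works unchanged in Python's re module. A minimal sketch with made-up input strings:

```python
import re


def strip_dots_in_brackets(text):
    """Remove dots that are followed by a closing ] with no other bracket in between."""
    return re.sub(r"\.(?=[^\[\]]*\])", "", text)


def strip_dots_and_marks_in_brackets(text):
    """Same idea, but also removes question marks."""
    return re.sub(r"[.?](?=[^\[\]]*\])", "", text)


print(strip_dots_in_brackets("a.b [c.d.e] f.g"))
print(strip_dots_and_marks_in_brackets("[a.b?c] d?"))
```

The lookahead assumes brackets are balanced and not nested; a stray ] later in the text would cause dots outside brackets to be removed too.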

Force linux to use php as php55

php,linux,fedora

You can create an alias: alias php="php55" Now if you type php it uses php55...

Store regex pattern as a string in PHP when regex pattern contains both single and double quotes

php,regex

The quotes are an issue but not the issue you are running into when you escape them. Your delimiter is terminating your regex just before the closing a which is giving you the unknown modifier error. It appears you don't have error reporting on though so you aren't seeing that....

regex - Match filename with or without extension

regex,logstash-grok

This is about as simple as I can get it: \b\w+\.?\w* See demo...

How to write RegEx for inserting line break for line length more than 30 characters?

regex

Find what: ^(.{30}) Replace with: \1\n ...
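
In Python the same find/replace can be sketched with re.sub and the MULTILINE flag, so that ^ matches at the start of every line. The function name break_long_lines is hypothetical.

```python
import re


def break_long_lines(text):
    """Insert a newline after the first 30 characters of each line that has at least 30."""
    return re.sub(r"^(.{30})", r"\1\n", text, flags=re.MULTILINE)


sample = "x" * 35 + "\n" + "short line"
print(break_long_lines(sample))
```

Note that this inserts at most one break per original line, and a line of exactly 30 characters also gets one.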

Regex with whitespaces and preceding zeros

regex,sas

You can use this simplified regex: /^[\s0]*11\s*$/ ...
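
A quick check of the pattern, sketched in Python rather than SAS; the sample values are made up.

```python
import re

# Any mix of leading whitespace and zeros, then 11, then optional trailing whitespace.
PATTERN = re.compile(r"^[\s0]*11\s*$")


def is_eleven(text):
    """True if text is 11, optionally padded with spaces and leading zeros."""
    return PATTERN.match(text) is not None


for candidate in [" 0011 ", "11", "0110", "111"]:
    print(repr(candidate), is_eleven(candidate))
```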