[regex] problem with regular expression
Hi all,

I am having some trouble writing a regular expression to parse a file of financial data. The file looks like this (simplified):

BEGIN FINANCIAL INFORMATION 0 4 John (the name is mentioned somewhere between the two tags) 6 3 8 END FINANCIAL INFORMATION
BEGIN FINANCIAL INFORMATION 0 David 6 8 END FINANCIAL INFORMATION
BEGIN FINANCIAL INFORMATION 2 4 7 4 0 3 John 6 8 END FINANCIAL INFORMATION

I want to create a regular expression that gives me all the financial information for John. My first try was:

"BEGIN FINANCIAL INFORMATION.*?John.*?END FINANCIAL INFORMATION"

The problem here is that ".*?" can also match "END FINANCIAL INFORMATION" (in this case the first ".*?", the one before "John"), which means that the following substring is also a match:

BEGIN FINANCIAL INFORMATION 0 David 6 8 END FINANCIAL INFORMATION BEGIN FINANCIAL INFORMATION 2 4 7 4 0 3 John 6 8 END FINANCIAL INFORMATION

Can somebody please help me with this regular expression? I'm trying to find out how I can write "any character except the string 'END FINANCIAL INFORMATION'", but without success so far.

Thanks in advance.

Kind regards,
John
John Kiopela wrote:
Can somebody please help me with this regular expression? I'm trying to find out how I can write "any character except the string 'END FINANCIAL INFORMATION'", but without success so far.
You could use a negative lookahead for this:

"(?:(?!END FINANCIAL INFORMATION).)+"

which matches any non-empty sequence of characters not containing the string "END FINANCIAL INFORMATION".

HTH,
John.
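As a rough illustration of how the lookahead could be slotted into the original expression, here is a minimal sketch. It uses Python's re module purely for demonstration (the thread does not say which language or regex library is in use), the sample data is the simplified example from the question with the parenthesised note removed, and the variable names are made up for this example. The lookahead group replaces both occurrences of "." so that neither wildcard can run past an END marker.

```python
import re

# Simplified sample data from the question (illustrative only).
data = (
    "BEGIN FINANCIAL INFORMATION 0 4 John 6 3 8 END FINANCIAL INFORMATION\n"
    "BEGIN FINANCIAL INFORMATION 0 David 6 8 END FINANCIAL INFORMATION\n"
    "BEGIN FINANCIAL INFORMATION 2 4 7 4 0 3 John 6 8 END FINANCIAL INFORMATION\n"
)

BEGIN = "BEGIN FINANCIAL INFORMATION"
END = "END FINANCIAL INFORMATION"

# One character, provided the END marker does not start at this position.
not_end = r"(?:(?!" + END + r").)"

# Original attempt: the lazy wildcard can still cross an END marker.
naive = re.compile(BEGIN + r".*?John.*?" + END, re.DOTALL)

# Same expression with each "." replaced by the guarded character.
# re.DOTALL lets "." match newlines, in case a block spans several lines.
guarded = re.compile(
    BEGIN + not_end + r"*?John" + not_end + r"*?" + END, re.DOTALL
)

naive_matches = naive.findall(data)
john_blocks = guarded.findall(data)

for block in john_blocks:
    print(block)
```

With the naive pattern the second match starts at David's BEGIN marker and runs through the end of John's last block; with the guarded pattern each match stays inside a single BEGIN/END pair.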
participants (2)
- John Kiopela
- John Maddock