Sed regexp multiline - replace HTML


I am attempting to replace multiple lines using sed on a Linux system.

Just as a side note, you should really try to be as specific as possible in your posting. "Replaced/removed" means "replaced OR removed". If you want it replaced, just say replaced. That helps both those of us trying to answer your question and future users who might be experiencing the same issue.

Although Tim Pote has answered the question, I will post this here in case someone needs to replace a multiline pattern.

While @nhahtdh's answer is the correct one for your original question, this solution is the answer to your comments:

sed '
    /<!-- PAGE TAG -->/,/<!-- PAGE TAG -->/ {
        1 {
            s/^.*$/Replace Data/
            b
        }
        d
    }
'

You can turn any series of sed commands into a one-liner with GNU sed by adding a semicolon after each command (but it's not recommended if you want to be able to read it later on):

sed '/<!-- PAGE TAG -->/,/<!-- PAGE TAG -->/ { 1 { s/^.*$/Replace Data/; b; }; d; };'
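To see the effect of the range-based script, here is a minimal sketch that runs the one-liner on a hypothetical four-line input (the sample lines are made up for illustration). It assumes GNU sed, and it assumes the opening tag sits on line 1 of the input, because the `1` address refers to the first line of the file rather than the first line of the range.

```shell
# Hypothetical sample input: the tagged block starts on line 1.
result=$(printf '%s\n' \
    '<!-- PAGE TAG -->' \
    'old content' \
    '<!-- PAGE TAG -->' \
    'keep me' |
  sed '/<!-- PAGE TAG -->/,/<!-- PAGE TAG -->/ { 1 { s/^.*$/Replace Data/; b; }; d; }')

# Line 1 is rewritten, the rest of the range is deleted,
# and lines outside the range pass through unchanged.
printf '%s\n' "$result"
```

With this input the output should be "Replace Data" followed by "keep me".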

If the text to be replaced is very long, I would suggest ex syntax.

You can read multiple lines into the pattern space and manipulate things surprisingly well, but with more than the usual effort. Sed has a set of commands which allow this type of thing. Here is a link to a Command Summary for sed. It is the best one I've found, and got me rolling.


sed '/^a test$/{
       $!{ N        # append the next line when not on the last line
         s/^a test\nPlease do not$/not a test\nBe/
                    # now test for a successful substitution, otherwise
                    #+  unpaired "a test" lines would be mis-handled
         t sub-yes  # branch_on_substitute (goto label :sub-yes)
         :sub-not   # a label (not essential; here to self document)
                    # if no substitution, print only the first line
         P          # pattern_first_line_print
         D          # pattern_ltrunc(line+nl)_top/cycle
         :sub-yes   # a label (the goto target of the 't' branch)
                    # fall through to final auto-pattern_print (2 lines)
       }
     }' alpha.txt

Here is the same script, condensed into what is obviously harder to read and work with, but which some would dubiously call a one-liner:

sed '/^a test$/{$!{N;s/^a test\nPlease do not$/not a test\nBe/;ty;P;D;:y}}' alpha.txt
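The contents of alpha.txt are not shown, so here is a minimal sketch of the same N / t / P / D technique run against made-up input. The four sample lines and the label name `done` are assumptions for illustration, and GNU sed is assumed (for `\n` in the replacement and for indented labels).

```shell
# Hypothetical stand-in for alpha.txt: one "a test" line followed by
# "Please do not", and one unpaired "a test" line.
result=$(printf '%s\n' 'a test' 'Please do not' 'a test' 'something else' |
  sed '/^a test$/{
    $!{
      N
      s/^a test\nPlease do not$/not a test\nBe/
      t done
      P
      D
      :done
    }
  }')
printf '%s\n' "$result"
```

The paired lines should come out as "not a test" and "Be", while the unpaired "a test" and the line after it are printed unchanged, because P and D hand the second line back to the start of the script.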

Here is my command "cheat-sheet":

: # label
= # line_number
a # append_text_to_stdout_after_flush
b # branch_unconditional
c # range_change
d # pattern_delete_top / cycle
D # pattern_ltrunc(line + nl) _top / cycle
g # pattern = hold
G # pattern += nl + hold
h # hold = pattern
H # hold += nl + pattern
i # insert_text_to_stdout_now
l # pattern_list
n # pattern_flush = nextline_continue
N # pattern += nl + nextline
p # pattern_print
P # pattern_first_line_print
q # flush_quit
r # append_file_to_stdout_after_flush
s # substitute
t # branch_on_substitute
w # append_pattern_to_file_now
x # swap_pattern_and_hold
y # transform_chars

The s command (as in substitute) is probably the most important in sed and has a lot of different options. The syntax of the s command is 's/regexp/replacement/flags'.

Its basic concept is simple: the s command attempts to match the pattern space against the supplied regular expression regexp; if the match is successful, then that portion of the pattern space which was matched is replaced with replacement.

The flags include:

number: Only replace the numberth match of the regexp.
p: If the substitution was made, then print the new pattern space.

s/\(b\?\)-/x\u\1/g
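A few small experiments make the s command and its flags concrete. This is a sketch with made-up input strings, assuming GNU sed (the `\?` and `\u` escapes are GNU extensions); the first command is the case-conversion substitution shown above.

```shell
# \u upper-cases the next character of the replacement; \1 is empty for the
# first "-" and "b" for the second, so only the "b" is affected.
upper=$(printf '%s\n' 'a-b-' | sed 's/\(b\?\)-/x\u\1/g')

# A numeric flag replaces only that occurrence (here the 2nd "a").
second=$(printf '%s\n' 'aaa' | sed 's/a/X/2')

# With -n, the p flag prints only lines where a substitution happened.
changed=$(printf '%s\n' 'foo' 'baz' | sed -n 's/foo/bar/p')

printf '%s\n' "$upper" "$second" "$changed"
```

The three results should be "axxB", "aXa", and "bar".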

sed 's/AA/0/g;s/AG/1/g;s/GG/2/g;' input.txt > output.txt

See the regular expressions how-to document from python.org. Refer to an overview of regular expressions from the University of Kentucky.

If you'd like to insert multiple lines before the current line, you can add additional lines by appending a backslash to the previous line.

In the replacement string, the '&' character tells sed to insert the entire matched regular expression. So, whatever was matched by '.*' (the largest group of zero or more characters on the line, or the entire line) can be inserted anywhere in the replacement string, even multiple times. This is great, but sed is even more powerful.

Let's look at one of sed's most useful commands, the substitution command. Using it, we can replace a particular string or matched regular expression with another string. Here's an example of the most basic use of this command:

user $ sed -e 's/foo/bar/' myfile.txt
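The '&' replacement and the backslash-continued insert described above can be sketched as follows. The input lines and inserted text are made up for illustration, and GNU sed is assumed.

```shell
# "&" in the replacement stands for the whole match; here ".*" matches the
# entire line, so the line is wrapped in extra text.
wrapped=$(printf '%s\n' 'hello' | sed -e 's/.*/ralph said: &/')

# "i\" inserts text before the current line; a trailing backslash continues
# the inserted text onto another line.
inserted=$(printf '%s\n' 'original line' | sed -e 'i\
first inserted line\
second inserted line')

printf '%s\n' "$wrapped" "$inserted"
```

The first command should give "ralph said: hello"; the second should print the two inserted lines ahead of "original line".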

The search pattern is on the left hand side and the replacement string is on the right hand side. When several expressions are given, sed repeats the process with the next expression, again operating on the pattern space.

Sed has several commands, but most people only learn the substitute command: s. The substitute command changes occurrences of the regular expression into a new value (by default only the first match on each line; the g flag replaces every match). A simple example is changing "day" in the "old" file to "night" in the "new" file:

sed s/day/night/ <old >new
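How many matches get replaced is easy to check on a line that contains the pattern twice. A minimal sketch, using a made-up input line:

```shell
# Without a flag, only the first match on each line is replaced.
first_only=$(printf '%s\n' 'day after day' | sed 's/day/night/')

# The g flag replaces every match on the line.
every_match=$(printf '%s\n' 'day after day' | sed 's/day/night/g')

printf '%s\n' "$first_only" "$every_match"
```

This should print "night after day" and then "night after night".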

To substitute one string with another, sed needs to know where the first string ends and the substitution string begins. For this, we bookend the two strings with the forward slash (/) character. The slash characters that surround the pattern are required because they are used as delimiters.

Suppose you want to surround the area code (the first three digits) of a phone number with parentheses for easier reading. To do this, you can use the ampersand replacement character. In the pattern part you match the first three digits, and then with & you replace those three digits with the same digits surrounded by parentheses.
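The area-code substitution described above might look like the following sketch. The phone numbers are hypothetical, and the exact pattern is an assumption rather than the original tutorial's command.

```shell
# Hypothetical phone numbers, one per line.
formatted=$(printf '%s\n' '5555551212' '4445556666' |
  sed 's/^[0-9][0-9][0-9]/(&)/')

# "&" re-inserts the three matched digits between the parentheses.
printf '%s\n' "$formatted"
```

The two lines should come out as "(555)5551212" and "(444)5556666".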

As mentioned previously, sed can be invoked by sending data through a pipe to it as follows:

$ cat /etc/passwd | sed
Usage: sed [OPTION]... {script-other-script} [input-file]...

  -n, --quiet, --silent
                 suppress automatic printing of pattern space
  -e script, --expression=script
...............................
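Running sed with no script only prints the usage text, so a piped invocation needs at least one script. A small sketch of the two options listed above, with made-up input:

```shell
# -n suppresses automatic printing, so only the explicit "p" command prints.
second_line=$(printf '%s\n' 'alpha' 'beta' 'gamma' | sed -n -e '2p')

# Several -e scripts are applied in order to each line.
edited=$(printf '%s\n' 'alpha' | sed -e 's/a/A/' -e 's/$/!/')

printf '%s\n' "$second_line" "$edited"
```

This should print "beta" and then "Alpha!".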

I'm importing WordPress blog posts into Pelican, in which I'm using the render-math plugin. This expects in-line equations to be delimited by single dollar signs, and displayed equations to be delimited by double dollar signs, e.g.:

...an example in-line equation is $E = mc^2$, which...

and:

...the following equation:

$$ F = ma $$

is displayed as a paragraph...

The WordPress posts use different delimiters, so a bit of mucking about with grep, sed, and regular expressions was required to get the equations to display in Pelican. To begin with, look at the basic regular expression syntax needed to pick out the \$latex, \$, &s=1 and &bg=ffffff strings.

Find the opening \$latex delimiters, first anywhere on a line and then only at the start of a line:

grep '\\$[latex]\{5\}' $(find ./content -name "*.md")
grep '^\\$[latex]\{5\}' $(find content -name "*.md")

The general form of a sed substitution, either writing to a new file or editing in place with -i:

sed s/string1/string2/g inputfile > outputfile
sed -i s/string1/string2/g inputfile

Replace the opening delimiter at the start of a line with $$, in one file and then in every Markdown file under content (the glob is quoted so that the shell does not expand it before find sees it):

sed -i s/'^\\$[latex]\{5\}'/'$$'/g inputfile
find content -name "*.md" | xargs sed -i s/'^\\$[latex]\{5\}'/'$$'/g

The next command replaces the WordPress-style closing delimiters (\$) that appear at the end of a line (and which are therefore most likely to be associated with displayed rather than in-line equations) with $$:

find content -name "*.md" | xargs sed -i s/'\\\$$'/'$$'/g
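Before editing files in place, the two delimiter substitutions can be tried on a single line through a pipe. This is a sketch only: the sample line is a made-up example of a displayed equation in the WordPress style, not taken from the original posts.

```shell
# Hypothetical line from a WordPress export containing a displayed equation.
converted=$(printf '%s\n' '\$latex F = ma\$' |
  sed -e 's/^\\$[latex]\{5\}/$$/' -e 's/\\\$$/$$/')

# The opening \$latex at the start of the line and the closing \$ at the
# end of the line are each replaced with $$.
printf '%s\n' "$converted"
```

For this input the result should be `$$ F = ma$$`. Note that `[latex]\{5\}` matches any five characters drawn from l, a, t, e, x, so it is looser than the literal string "latex".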

The escape sequence \n matches a newline character embedded in the pattern space. A literal newline character must not be used in the regular expression of a context address or in the substitute command.

The argument text consists of one or more lines. Each embedded newline character in the text must be preceded by a backslash. Other backslashes in text are removed and the following character is treated literally.

A command line with one address selects each pattern space that matches the address.

Some of the commands use a hold space to save all or part of the pattern space for subsequent retrieval. The pattern and hold spaces will each be able to hold at least 8192 bytes.

sed [-n] script [file...]

sed [-n] [-e script]... [-f script_file]... [file...]
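The statements above about embedded newlines and the hold space can be illustrated with two short commands. This is a sketch with made-up input, and it relies on semicolon-separated commands, which GNU sed accepts.

```shell
# N appends the next input line, so the pattern space holds an embedded
# newline that \n can match.
joined=$(printf '%s\n' 'first' 'second' | sed 'N;s/\n/ + /')

# Reverse the input using the hold space: G appends the hold space to every
# line but the first, h saves the result, and the last line prints it.
reversed=$(printf '%s\n' 1 2 3 | sed -n '1!G;h;$p')

printf '%s\n' "$joined" "$reversed"
```

The first command should give "first + second"; the second should print 3, 2, 1 on separate lines.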