I am still a noob at shell scripts but am trying hard. Below is a partially working shell script that is supposed to remove all JS from *.htm documents by matching script tags and deleting their enclosed content.
find $1 -name "*.htm" > ./patterns
for p in $(cat ./patterns)
do
sed -e "s/]//g" $p #> tmp.htm ; mv tmp.htm $p
done
The problem with this script is that, because sed reads its input line by line, it will not work as expected when the content spans newlines. Running:
will remove the first script tag but will leave the "foo" and the closing tag in place, which I don't want.
Is there a way to match new-line characters in my regular expression? Or if sed
is not appropriate, is there anything else I can use?
Answer
Assuming that you have tags on different lines, e.g. something like:
foo
bar
foo
the following should work:
sed '/
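As a minimal sketch of one way to handle this layout, assuming each opening and closing script tag sits on its own line, sed's two-address range form can delete whole blocks. The function name `strip_script_blocks` and the sample input are made up for illustration and are not taken from the original post.

```shell
#!/bin/sh
# Hypothetical helper (name is an assumption): delete every line from one
# containing "<script" through the next line containing "</script>", inclusive.
strip_script_blocks() {
    sed '/<script/,/<\/script>/d'
}

# Sample input (an assumption) mirroring the foo / bar / foo layout above.
printf '%s\n' 'foo' '<script type="text/javascript">' 'bar' '</script>' 'foo' |
    strip_script_blocks
```

With the sample input, only the two "foo" lines are printed. Two caveats of this sketch: any other text sharing a line with either tag is deleted as well, and if a script element opens and closes on the same line, sed only starts looking for the closing pattern on the following line, so the range runs on to the next closing tag or to the end of the file. The match is also case-sensitive, so uppercase tags are not caught.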