On Feb 17 Allen S. Rout wrote:
> >> On Sat, 17 Feb 2007 17:43:20 -0500, Dan Trevino said:
>
>
> > I need to parse a tab delimited text file of several thousand lines. The
> > first part is easy;
>
> > cut -f8 file
>
> > field 8 of this file contains multiple, variable length, sentences enclosed
> > in double quotes. Example returned by the cut command above:
>
> > "this is sentence one. this i sentence two. this ""is a quote that may be""
> > in sentence three."
>
> > I need to grab the first sentence for further processing (without the
> > period, without the beginning quote mark) into a variable, but am having
> > difficulty. Can anyone suggest an easy way to do this? I'm open to
> > bash,perl,python solutions, but prefer bash.
>
So for the example above what you want is:
this is sentence one
Is that right?
Will there always be a quote to start the line?
>
> $foo = [
> '"this is sentence one. this i sentence two. this ""is a quote that may be""in sentence three."',
> '"this is lalala. "'
> ];
>
> print join ("\n", map { /\"([^.]+)\./; $1 } @$foo ) ;
>
Putting it together with a slight twist to Allen's solution:
cut -f8 file | perl -lne 's/^"([^.]+)\./$1/ or die "no match:$_...";print $1'
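Since Dan said he prefers bash, here is a minimal sketch that uses only parameter expansion. The function name first_sentence is made up for illustration. It assumes the field starts with a double quote and that the first period ends the first sentence. Unlike the Perl version, it passes a field with no period through unchanged instead of dying.

```shell
#!/usr/bin/env bash

# Hypothetical helper: print the first sentence of one quoted field.
first_sentence() {
    local field=$1
    field=${field#\"}               # drop the leading double quote, if present
    printf '%s\n' "${field%%.*}"    # cut from the first period to the end
}

# Sample taken from the example in the original question.
sample='"this is sentence one. this i sentence two. this ""is a quote that may be"" in sentence three."'
result=$(first_sentence "$sample")
echo "$result"
```

To run it over the whole file, something like this should work: cut -f8 file | while IFS= read -r line; do first_sentence "$line"; done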
hope that helps
Jim