On Sun, 2007-12-02 at 20:03 -0500, Mark Oden wrote:
> I have a large dataset that looks similar to this:
>
> 1113531405.073 15577 91 21 12001 0.489 0.092 Complete 2895 80
> 1114032172.486 15577 469 15 7860 0.695 0.089 Complete 2895 80
>
> I would like to sort the data by column 3 followed by sorting by column
> 1. I know using the sort command I can differentiate between which
> column I want to sort by, but I don't know how to sort by one column
> then by another. Any suggestions? Also, are there any other sorting
> programs that use an algorithm other than the one provided by the sort
> program? My data size is 700 MB so I fear it may take a while and would
> like to use a very efficient algorithm :-)
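
sort -s -k3,3n -k1,1n

Give each key an explicit end field (3,3 and 1,1); a bare "-k 3" makes
the key run from column 3 to the end of the line. Keys are compared in
the order given, so column 1 only breaks ties in column 3.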

Adding --buffer-size=700M (or thereabouts) may help, since a larger
in-memory buffer means sort has fewer temporary files to write and merge.

As far as I know, GNU sort uses a merge sort and spills sorted runs to
temporary files when the input outgrows its buffer, so it is already
well suited to large files.

And 700MB doesn't sound that large to me, just so long as you're not
doing it often in real time ;)
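
To check the two-key ordering before running it on the full file, here is
a small sketch that spells the keys out with explicit end fields. The
first two rows are the ones quoted above; the third row is made up so
that column 3 has a tie for column 1 to break. The temporary file and the
LC_ALL=C setting (so "." is read as the decimal point) are my own
assumptions, not something your setup necessarily needs.

```shell
# Temporary sample file (hypothetical; stands in for the real 700 MB dataset).
sample=$(mktemp)
cat > "$sample" <<'EOF'
1114032172.486 15577 469 15 7860 0.695 0.089 Complete 2895 80
1113531405.073 15577 91 21 12001 0.489 0.092 Complete 2895 80
1113531000.000 15577 469 15 7860 0.695 0.089 Complete 2895 80
EOF

# Primary key: column 3, numeric. Secondary key: column 1, numeric.
# -s keeps the input order for rows that tie on both keys.
sorted=$(LC_ALL=C sort -s -k3,3n -k1,1n "$sample")
printf '%s\n' "$sorted"

rm -f "$sample"
```

The row with 91 in column 3 should come out first, followed by the two
469 rows in ascending order of column 1.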
--
Edward Allcutt