Filtering out repeated lines of text

2003-08-05 00:00:00

I needed a program that removed the repeated lines from a text, and since the UNIX uniq was practically useless for this (it only collapses repeats that sit next to each other), I had to write my own (unique):

#!/usr/bin/perl
@lineas = <STDIN>;
foreach $linea (@lineas) {
    $estaba = "Pono";
    foreach $cadena (@repasadas) {
        if ($cadena eq $linea) { $estaba = "Pozi"; }
    }
    if ($estaba eq "Pono") {
        push @repasadas, $linea;
        print "$linea";
    }
}

UPDATE 23/07/2004: Using dictionaries (hashes), _much_ faster.

#!/usr/bin/perl
while (<STDIN>) {
    if (!(exists $lineas{$_})) {
        print "$_";
        $lineas{$_} = "1";
    }
}
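As a rough point of comparison, the same idea (remember each line the first time it is seen, skip it afterwards) can be sketched as a one-line awk filter wrapped in a shell function. The name unique_lines is made up for illustration and does not come from the scripts above; take it as a sketch under that assumption, not as a replacement for the Perl version.

```shell
#!/bin/sh
# unique_lines: hypothetical helper name, used here only for illustration.
# Prints each distinct input line once, in order of first appearance.
unique_lines() {
    # seen[$0] is 0 (false) the first time a line shows up, so the pattern
    # !seen[$0]++ is true only on that first occurrence; awk's default
    # action for a true pattern is to print the line.
    awk '!seen[$0]++' "$@"
}
```

Used as a filter, printf 'c\na\na\nb\na\n' | unique_lines prints c, a, b, keeping the original order. Like the hash version, it keeps every distinct line in memory.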

Keith Amling (25/09/2005, 13:22)

uniq only compares adjacent lines, so it is designed for use with sorted text. If you sort the input first it will handle it correctly, for example:

$ cat file
c
a
a
b
a
$ cat file | uniq
c
a
b
a
$ cat file | sort | uniq
a
b
c

If you need the lines kept in their original order, then uniq is useless. While I'm on the subject of uniq, don't forget uniq -c, which prefixes each output line with the number of times it was repeated.

Saiyine (25/09/2005, 23:41)

Yeah, that was exactly the problem: I didn't want sort to mess with the order. Thanks for all the comments!

Older posts

2003-08-07 00:00:00 - Random phrase.

2003-10-28 00:00:00 - Generate a given number of text files.

2003-11-02 00:00:00 - Antispam filter on a Maildir mail folder.

2003-12-29 00:00:00 - Text-mode calculator.

2004-01-28 00:00:00 - Let lines through from or up to a given line.

Saiyine

Hi! Welcome to Saiyine Punto Com where I talk about anything that goes through my mind!

I can promise, and I do promise, that as soon as possible there will be a menu or something like that here.
