#211216 - 2016-03-18 12:12 PM
Searching for duplicate strings in CSV file
Sweeny
Fresh Scripter
Registered: 2016-03-18
Posts: 18
Loc: Hampshire
Hi all,
I'm hoping someone can shed a little light on this. I'm trying to write a script that goes through a massive CSV (over 6 GB in size!) and finds any duplicate lines in that file. Ideally, the script would read the first line of the CSV, search the entire file for copies of that line (indicating duplication), then move on to the second line, and so on. I hope this won't complicate things too much, but I'd also like the script to delete any duplicates it finds, leaving a substantially smaller CSV with a single instance of each line.
Any help would be greatly appreciated.
Cheers
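For reference, here is a minimal sketch of the requested clean-up in Python. It is not the code posted in this thread, and it swaps in a different technique from the one described above: comparing every line against the whole file would re-read 6 GB once per line, so the sketch makes a single pass and remembers a short digest of each line it has already written. The function name `dedupe_lines` and the file names in the usage note are hypothetical. It assumes a duplicate means a byte-for-byte identical line (ignoring the line ending), and that a 16-byte digest per distinct line fits in memory.

```python
import hashlib


def dedupe_lines(src_path, dst_path):
    """Copy src_path to dst_path, keeping only the first copy of each line.

    Returns the number of duplicate lines that were dropped.
    """
    seen = set()   # digests of the lines already written
    dropped = 0
    # Binary mode avoids decoding a multi-gigabyte file and keeps bytes intact.
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        for line in src:
            # Strip the line ending so "a,1\r\n" and a final "a,1" compare equal.
            content = line.rstrip(b"\r\n")
            # Store a 16-byte digest instead of the line itself to save memory.
            key = hashlib.blake2b(content, digest_size=16).digest()
            if key in seen:
                dropped += 1
                continue
            seen.add(key)
            dst.write(line)
    return dropped
```

Calling `dedupe_lines("big.csv", "big_unique.csv")` writes the de-duplicated copy to a new file and leaves the original untouched. Because it compares digests rather than full lines, two different lines could in principle be treated as equal, although with a 128-bit digest that is extremely unlikely. If the number of distinct lines is too large for memory, sorting the file externally and dropping adjacent repeats is the usual alternative.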
#211228 - 2016-03-21 01:02 PM
Re: Searching for duplicate strings in CSV file
[Re: Jochen]
Sweeny
Hi all,
Thank you very much for the prompt response. I might be missing something very simple here, but I keep getting [ERROR : expected ')'!] when I try to incorporate these functions into my script. Is there anything I need to change?
Cheers, Tom
#211230 - 2016-03-21 02:24 PM
Re: Searching for duplicate strings in CSV file
[Re: Jochen]
Sweeny
AHHH!!! Thank you! Yeah the file size...
#211232 - 2016-03-21 04:53 PM
Re: Searching for duplicate strings in CSV file
[Re: Sweeny]
Sweeny
Works a treat! Thank you all for the help, much appreciated.
Cheers