How to remove lines based on another file? [duplicate]

Asked
Active 3 hr before
Viewed 126 times

8 Answers

90%

You don't need to iterate; you just need grep with the -v option to invert the match and -w to force each pattern to match only whole words:

grep -wvf file2.txt file1.txt
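
A quick demo of what -w buys you, with two hypothetical files:

$ cat file1.txt
apple
apples
banana
$ cat file2.txt
apple
$ grep -wvf file2.txt file1.txt
apples
banana

Without -w, the pattern apple would also match (and therefore remove) apples.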
88%

grep -v -f file2.txt file1.txt
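
Note that without -w, each line of file2.txt is treated as a regular expression and may match substrings. A common tightening (my suggestion, not part of this answer) is to add -F for fixed strings and -x for whole-line matches:

$ grep -Fvxf file2.txt file1.txt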
72%

Are the duplicated lines adjacent to one another? Is the output to remain in the same order, or would it be OK to sort the data? – Kusalananda ♦ May 19 '18 at 13:14

Based on your comments, you want the result in the same output file, without having to create another output file or append into a new one. tac reverses the lines in the file, and awk will only output the first occurrence of a duplicate line.

See my answer without creating another output to store the result – MaXi32 Jun 6 '20 at 8:46
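
A minimal sketch of the keep-last idiom the comment describes (the array name seen is my choice):

$ tac stuff.txt | awk '!seen[$0]++' | tac

To write the result back to the same file without a second output file, the pipeline can be finished with sponge from the moreutils package, which absorbs all input before overwriting its argument.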

Demo file stuff.txt contains:

one
two
three
one
two
four
five

Remove duplicate lines from a file, assuming you don't mind that the lines come out sorted:

$ sort -u stuff.txt
five
four
one
three
two

Remove duplicate lines from a file, preserve original ordering, keep the first:

$ cat -n stuff.txt | sort -uk2 | sort -nk1 | cut -f2-
one
two
three
four
five

Remove duplicate lines from a file, preserve order, keep the last:

tac stuff.txt > stuff2.txt
cat -n stuff2.txt | sort -uk2 | sort -nk1 | cut -f2- > stuff3.txt
tac stuff3.txt > stuff4.txt
cat stuff4.txt
three
one
two
four
five
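
The same four commands can be chained into a single pipeline, so the intermediate stuff2–stuff4 files are never created; a sketch:

$ tac stuff.txt | cat -n | sort -uk2 | sort -nk1 | cut -f2- | tac
three
one
two
four
five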
65%

The sort command is used to order the lines of a text file, and uniq filters duplicate adjacent lines from a text file. These commands have many more useful options; I suggest you read the man pages by typing the following man commands: man sort and man uniq.

How can I remove duplicate lines permanently from my Unix file? Using the uniq command they are only deleted temporarily. – sartyaki Feb 18, 2017 @ 15:38

Here is a sample test file called garbage.txt displayed using the cat command:
cat garbage.txt
Sample outputs:

this is a test
food that are killing you
wings of fire
we hope that the labor spent in creating this software
this is a test
unix ips as well as enjoy our blog
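
Note that the duplicate "this is a test" lines above are not adjacent, so plain uniq would miss them; sort first. To make the change permanent (the question in the comment above), write the result back to the file. sort can do this safely with its -o option, since it reads all of its input before writing:

$ sort -u garbage.txt -o garbage.txt

If the original line order matters, the awk one-liner shown further below can be used with a temporary file instead.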
75%

To remove lines common to two files you can use grep, comm or join.

comm is a utility command that works on lexically sorted files. It takes two files as input and produces three text columns as output: lines only in the first file; lines only in the second file; and lines in both files. You can suppress printing of any column with the -1, -2 or -3 option accordingly.

Finally, there is join, a utility command that performs an equality join on the specified files. Its -v option also allows removing lines common to both files.
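
Sketches of both approaches (file1 and file2 are placeholders; <(…) is bash process substitution, used here to satisfy the sorted-input requirement):

$ comm -23 <(sort file1) <(sort file2)
$ join -v1 <(sort file1) <(sort file2)

comm -23 suppresses the "only in file2" and "in both" columns, leaving lines unique to file1; join -v1 prints the unpairable lines of file 1, which amounts to the same thing when each line is a single field.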

grep is only practical when the pattern file is small, since -f holds every pattern in memory. Use -v along with -f:

grep -vf file2 file1
40%

# Example 1
awk 'NR==FNR{a[$0];next} !($0 in a)' file_2 file_1
# This returns all lines in file_1 that are not found in file_2.
# See the source for more info on how the command works.
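
To unpack the condition: while awk reads the first file given (file_2 here), the global record counter NR equals the per-file counter FNR, so each of its lines is stored as a key in array a and next skips to the next record; for the second file NR != FNR, and a line is printed only if it is not a key in a. A small demo with assumed file contents:

$ printf 'apple\nbanana\ncherry\n' > file_1
$ printf 'banana\n' > file_2
$ awk 'NR==FNR{a[$0];next} !($0 in a)' file_2 file_1
apple
cherry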
22%

sort -uk2 sorts the lines based on the second column (the k2 option) and keeps only the first occurrence of lines with the same second-column value (the u option). sort -nk1 then sorts the lines based on their first column (the k1 option), treating the column as a number (the -n option). Finally, cut -f2- prints each line from its second column to the end (note the - suffix in -f2-, which instructs cut to include the rest of the line). In the awk command below, the $0 variable holds the contents of the line currently being processed.

To remove the duplicate lines while preserving their order in the file, use:

awk '!visited[$0]++' your_file > deduplicated_file
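
visited[$0]++ evaluates to zero (falsy) the first time a line is seen and non-zero afterwards, so negating it prints only first occurrences. Applied to the stuff.txt demo from earlier:

$ awk '!visited[$0]++' stuff.txt
one
two
three
four
five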
60%

Now I want to remove all the -----, but only if they're repeating after each other.

Just do uniq filename.txt. As the name states, only unique lines are left and the repetitions are merged.

That's exactly what the uniq command is made for:

NAME
uniq - report or omit repeated lines

SYNOPSIS
uniq [OPTION]... [INPUT [OUTPUT]]

DESCRIPTION
Filter adjacent matching lines from INPUT (or standard input), writing to OUTPUT (or standard output).

With no options, matching lines are merged to the first occurrence.
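
A sketch of the dashes case from the question above (demo.txt and its contents are assumed for illustration):

$ cat demo.txt
foo
-----
-----
bar
-----
$ uniq demo.txt
foo
-----
bar
-----

Note that the final ----- survives: uniq merges only adjacent repetitions, which is exactly the behavior the question asks for.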

Other "lines-remove" queries related to "How to remove lines based on another file? [duplicate]"