How to extract duplicate numbers from a log file? [closed]

-1

I have some log files on one of my servers that contain entries like the one below.

FTM.FC103.20181228034503.20181228035250:2018-12-28 08:19:59.893 FAIL DROP: Too many resend tries failed Failed for request id: 8397796 Cause: unknown Info: Code: ,USSD RequestId=8397796 OriginalId=8397545 EventCorrelationId="03a4264124" CreationTime="20181228081949" ResendCount=1 Timestamp=1545968994377 (Fri Dec 28 08:19:54 AFT 2018) State=STATE_SENT SubscriberNumber=96700606310 UssdText=Last event was charged 3.00 RYL, Duration 0:00:52, Remaining balance 35.29 AFN and will expire 25.12.2020.1500 RYL = 32GB valid 30 Days, Dial *811*32*1#. NumberingPlan=1 Nadi=4 UssdFormat=2

I want to extract the following information from these logs:

1- Extract every SubscriberNumber value from the log files.

2- Then find the SubscriberNumber values that occur more than once in the logs.

edited Dec 30 '18 at 7:54

Rui F Ribeiro

39.4k1479131

asked Dec 30 '18 at 5:41

Jack Anderson

closed as too broad by G-Man, αғsнιη, Thomas, Archemar, peterh Jan 1 at 4:31

Please edit the question to limit it to a specific problem with enough detail to identify an adequate answer. Avoid asking multiple distinct questions at once. See the How to Ask page for help clarifying this question. If this question can be reworded to fit the rules in the help center, please edit the question.

linux shell


1 Answer

You could use:

grep -oP 'SubscriberNumber=\K(\d+)' logfile | sort -n | uniq -cd

grep -oP 'SubscriberNumber=\K(\d+)' logfile isolates each individual SubscriberNumber in your logfile, printing only the digits;

sort -n sorts them numerically, so that identical numbers end up on adjacent lines, and

uniq -cd prints only the duplicated numbers, i.e. those with multiple occurrences, each prefixed with its count.
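To try the pipeline without touching a real log, here is a minimal sketch that runs it on a small sample file. The three log lines are made-up, shortened entries, and dup_counts is a hypothetical helper name, not something from the original answer. It assumes a grep built with -P (PCRE) support, such as GNU grep.

```shell
#!/bin/sh
# Hypothetical sample data: three shortened log lines, not taken from a real log.
logfile=$(mktemp)
cat > "$logfile" <<'EOF'
State=STATE_SENT SubscriberNumber=96700000165 UssdText=a
State=STATE_SENT SubscriberNumber=96700000584 UssdText=b
State=STATE_SENT SubscriberNumber=96700000165 UssdText=c
EOF

# dup_counts (hypothetical helper): print the digits that follow
# "SubscriberNumber=", sort them, and keep only the values that occur
# more than once, each prefixed with its count.
dup_counts() {
    grep -oP 'SubscriberNumber=\K\d+' "$1" | sort -n | uniq -cd
}

dup_counts "$logfile"
rm -f "$logfile"
```

With this sample, only 96700000165 should be reported, with a count of 2; 96700000584 occurs once and is dropped by uniq -d.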

edited Dec 30 '18 at 8:27

answered Dec 30 '18 at 7:20

ozzy

4414

Many thanks, Ozzy. Could you please also help with counting the duplicate occurrences of each number? For example: 11 ---- 967090099887
– Jack Anderson
Dec 30 '18 at 8:10

Thanks again, Ozzy. I now have the duplicate occurrences. Is it possible to list only the numbers whose count falls in a given range, for example between 20 and 100, or more than 30?
grep -oP 'SubscriberNumber=\K(\d+)' 30.unknown.txt | sort -n | uniq -cd
3 96700000165
2 96700000584
23 96700001632
6 96700001744
4 96700001876
2 96700002632
2 96700003071
2 96700004656
3 96700004948
10 96700006053
2 96700007154
2 96700007248
– Jack Anderson
Dec 30 '18 at 9:47
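The thread does not answer this follow-up. One possible approach, offered only as a sketch, is to filter the "count number" lines on their first field with awk. The helper name filter_counts and the bounds 10 and 100 are assumptions chosen for illustration; the four input lines are copied from the output quoted in the comment.

```shell
#!/bin/sh
# filter_counts (hypothetical helper): keep only "count number" lines
# whose count lies between min and max, inclusive.
filter_counts() {
    awk -v min="$1" -v max="$2" '$1 >= min && $1 <= max'
}

# A few "count number" pairs copied from the output quoted in the comment.
printf '%s\n' \
    '3 96700000165' \
    '23 96700001632' \
    '6 96700001744' \
    '10 96700006053' |
    filter_counts 10 100
```

Here the lines for 96700001632 (count 23) and 96700006053 (count 10) pass the filter. In practice the helper would be appended to the pipeline, as in ... | uniq -cd | filter_counts 20 100; the leading spaces that uniq -c prints do not affect awk's field splitting.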

2

@JackAnderson It would be easier to help you if you'd first formulate a complete question. Alternatively, you can mark the original question as answered and open a new one. Answering an iteratively changing question may require repeated revisions of the answer, and even backtracking on a chosen approach.
– ozzy
Dec 30 '18 at 10:09

Ozzy, could you please also explain the purpose of the following parts of the pattern: \K(\d+)?
– Jack Anderson
Dec 31 '18 at 7:05

@JackAnderson The \d matches any digit; (\d+) is a group of one or more consecutive digits. The \K is explained here: stackoverflow.com/questions/33573920/….
– ozzy
Dec 31 '18 at 8:10
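The effect of \K can be seen by running the pattern with and without it. This is a small sketch, assuming a grep with -P (PCRE) support; the input is a shortened version of the log line from the question, and the variable names are illustrative only.

```shell
#!/bin/sh
# Shortened version of the log entry from the question.
line='State=STATE_SENT SubscriberNumber=96700606310 UssdText=Last event'

# Without \K, -o prints the whole match, including the "SubscriberNumber=" prefix.
with_prefix=$(printf '%s\n' "$line" | grep -oP 'SubscriberNumber=\d+')

# With \K, the text matched before it is dropped from the reported match,
# so only the digits are printed.
digits_only=$(printf '%s\n' "$line" | grep -oP 'SubscriberNumber=\K\d+')

printf '%s\n' "$with_prefix" "$digits_only"
```

The first printed line keeps the prefix and the second contains only 96700606310. The parentheses in (\d+) are not needed for this; with grep -o the whole reported match is printed regardless of groups.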

