Merge pull request #58 from NithinRamuSAS/main
Added example to remove duplicated rows by columns
2 parents ccc842b + 28f487c commit 7f68606

2 files changed: +69 -0 lines changed
Lines changed: 49 additions & 0 deletions
@@ -0,0 +1,49 @@
/************************************************************************************************
 REMOVE DUPLICATED ROWS BY COLUMNS
    This program sorts the input table by the specified columns so that rows with the same
    values in these columns are adjacent. The sort also ensures that the first row in each
    group has the highest value of a given column. All subsequent (duplicate) rows for the
    specified columns are then removed.
    Keywords: PROC SORT, remove duplicates
    SAS Versions: SAS 9, SAS Viya
    Documentation: https://documentation.sas.com/?cdcId=pgmsascdc&cdcVersion=default&docsetId=proc&docsetTarget=p1nd17xr6wof4sn19zkmid81p926.htm
    1. Create the WORK.CLASSTEST table.
    2. PROC SORT sorts rows by the specified columns (here: Name) so that rows with duplicate
       values are adjacent, and by descending value of a given column (here: Score) so that
       the first row in each group has the highest value.
    3. Another PROC SORT step with the NODUPKEY option deletes rows with duplicate BY values
       for the specified columns (here: Name), leaving only one row per group (here: the one
       with the highest Score for each Name). This works because PROC SORT keeps the
       relative order of rows within a BY group by default (EQUALS option).
************************************************************************************************/

data classtest;                                              /*1*/
    infile datalines dsd;
    input
        Name    :$7.
        Subject :$7.
        Score;
    datalines4;
Judy,Reading,91
Judy,Math,79
Barbara,Math,90
Barbara,Reading,86
Louise,Math,72
Louise,Reading,65
William,Math,61
William,Reading,71
Henry,Math,62
Henry,Reading,75
Henry,Reading,84
Jane,Math,94
Jane,Reading,96
;;;;
run;

proc sort data=classtest out=classtest_sort;                 /*2*/
    by Name descending Score;
run;

proc sort data=classtest_sort out=classtest_nodup nodupkey;  /*3*/
    by Name;
run;
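
As a point of comparison, and not part of this commit, the second PROC SORT step could probably be swapped for DATA step BY-group processing. The following is a minimal sketch that assumes WORK.CLASSTEST from step 1 already exists; the output table name `classtest_first` is hypothetical.

```sas
/* Assumes WORK.CLASSTEST from step 1 of the program above already exists. */
proc sort data=classtest out=classtest_sort;
    by Name descending Score;
run;

/* Keep only the first row of each Name group. Because of the sort above, */
/* that row is the one with the highest Score for the Name.               */
data classtest_first;   /* hypothetical output table name */
    set classtest_sort;
    by Name;
    if first.Name;
run;
```

This variant does not depend on the ordering behavior of a second sort, at the cost of an extra DATA step.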
Lines changed: 20 additions & 0 deletions
@@ -0,0 +1,20 @@
/************************************************************************************************
 PRINT EXTREME OBSERVATIONS
    This program prints a specified number of observations with the highest and lowest values
    of a specified variable. It labels these observations with an identifier variable.
    Keywords: PROC UNIVARIATE
    SAS Versions: SAS 9, SAS Viya
    Documentation: https://documentation.sas.com/doc/en/pgmsascdc/v_066/procstat/procstat_univariate_toc.htm
    1. Specify the output table that ODS should display from the following PROC UNIVARIATE step.
    2. PROC UNIVARIATE prints the N (here: 3) lowest and N highest observations from the
       specified input table (here: SASHELP.CLASS).
    3. Specify the variable by which the highest and lowest observations are selected
       (here: Height).
    4. A specified variable (here: Name) is used to identify these observations.
************************************************************************************************/

ods select ExtremeObs;                             /*1*/
proc univariate data=sashelp.class nextrobs=3;     /*2*/
    var Height;                                    /*3*/
    id Name;                                       /*4*/
run;
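
If the extreme observations are needed as a table rather than printed output, one possible variation, not part of this commit, is to route the same ODS table to a data set with the ODS OUTPUT statement. This sketch reuses the `ExtremeObs` table name from the program above; the data set name `work.height_extremes` is hypothetical.

```sas
/* Route the ExtremeObs ODS table to a data set instead of only displaying it. */
ods output ExtremeObs=work.height_extremes;   /* hypothetical data set name */
proc univariate data=sashelp.class nextrobs=3;
    var Height;
    id Name;
run;

/* Inspect the captured table. */
proc print data=work.height_extremes;
run;
```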
