Removing second record for the same ID

Question

Removing second record for the same ID

61 views Asked by user8098993 At 22 February 2023 at 14:44

I have a dataset in Stata in which an ID can have a maximum of two different records. I am trying to restrict my dataset so that the second record is removed if the result for test A is positive. I have provided an example of a sample dataset below. In my desired output I would like to keep both records for ID 1, the first record for ID 2 (test A record), and the record for ID 3 and ID 4.

      ID    |      Test Type  Result    
------------+-----------------------------------
          1 |          A       -ve       
          1 |          B       +ve      
          2 |          A       +ve       
          2 |          B       +ve     
          3 |          A       +ve 
          4 |          A       -ve  
------------+-----------------------------------

Currently my dataset is sorted by ID, but I am unable to figure out how to restrict the dataset based on my desired criteria described above.

Original Q&A

There are 3 answers

**Shuo** · Answer 1 · 2023-02-22T15:28:34+00:00

Create a new dataset which has only ID and Test Type A's result.

Merge the new dataset with original data, then drop observations in which result_a is positive and Test Type is B.

I haven't tested the code because I don't have Stata on my pc, but this should work.

preserve
keep if TestType == "A"
keep ID Result
ren Result Result_a
save a.dta, replace
restore

merge m:1 ID using a.dta

drop if Result_a == "+ve" & TestType == "B"

**Ignacio2424** · Answer 2 · 2023-02-22T15:47:17+00:00

Ignacio2424 On 22 February 2023 at 15:47

try something like:

by ID: gen aux1=_n
bys ID Result: gen aux2=_N
drop if aux1!=1&aux2!=1&Result=="+ve"

Edit based on comments.

**Nick Cox** · Answer 3 · 2023-02-22T19:10:23+00:00

This may help:

* Example generated by -dataex-. For more info, type help dataex
clear
input byte id str1 test_type str3 result
1 "A" "-ve"
1 "B" "+ve"
2 "A" "+ve"
2 "B" "+ve"
3 "A" "+ve"
4 "A" "-ve"
end

bysort id (test_type) : drop if _n > 1 & result[1] == "+ve"


. list, sepby(id)

     +------------------------+
     | id   test_t~e   result |
     |------------------------|
  1. |  1          A      -ve |
  2. |  1          B      +ve |
     |------------------------|
  3. |  2          A      +ve |
     |------------------------|
  4. |  3          A      +ve |
     |------------------------|
  5. |  4          A      -ve |
     +------------------------+

TechQA.

Removing second record for the same ID

There are 3 answers

Related Questions in STATA

Related Questions in MULTIPLE-RECORDS

Popular Questions

Trending Questions