Hi everyone,

I have two questions:

Question 1. I created this sample dataset (test):

  code Age open outcome
1    A  12    0       1
2    B  15    0       0
3    C  18    0       1
4    D  12    1       0
5    E  18    1       1
6    F  20    1       0
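For anyone who wants to reproduce this, the table above can be rebuilt as a data frame roughly as follows. This is only a sketch: the column types are an assumption (code as character, the other three as numeric), since the printed table does not show them.

```r
# Rebuild the sample dataset shown above.
# Assumption: code is a character column; Age, open and outcome are numeric.
test <- data.frame(
  code    = c("A", "B", "C", "D", "E", "F"),
  Age     = c(12, 15, 18, 12, 18, 20),
  open    = c(0, 0, 0, 1, 1, 1),   # treatment indicator
  outcome = c(1, 0, 1, 0, 1, 0),
  stringsAsFactors = FALSE
)

print(test)
```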

When I run this command:

library(cem)

# Variables to leave out of the matching
todrop <- c("outcome", "code")

cem2 <- cem(treatment = "open", data = test, drop = todrop, k2k = TRUE)

I get this data back:

  code Age open outcome
1    A  12    0       1
2    C  18    0       1
3    D  12    1       0
4    F  20    1       0

When I use matchit instead:

library(MatchIt)

match <- matchit(open ~ Age, data = test, method = "exact")

I get this result:

  code Age open outcome weights subclass
1    A  12    0       1       1        1
3    C  18    0       1       1        2
4    D  12    1       0       1        1
5    E  18    1       1       1        2

So my question is: why does CEM not choose record "E" (age 18), and instead choose record "F" (age 20), as the match for "C"? Is the exact method in matchit more accurate than CEM in this case?

Question 2. I have a database with 140k records and 440 variables, and I want to match on only 20 of those variables. If I want to use CEM, is there an easy way to specify just those 20 variables, without having to drop the other 420?
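To make the question concrete, here is a minimal sketch of one possible workaround. It assumes that drop takes a character vector of column names, as in the call in Question 1. The data frame bigdata and the vector match_vars are made-up stand-ins, not the real data, and whether this is the recommended way to do it is part of the question.

```r
# Hypothetical stand-in for the real 140k x 440 dataset (all names are made up).
bigdata <- data.frame(
  open    = c(0, 0, 1, 1),
  Age     = c(12, 18, 12, 18),
  sex     = c(0, 1, 0, 1),
  code    = c("A", "B", "C", "D"),
  outcome = c(1, 0, 1, 0),
  stringsAsFactors = FALSE
)

# Variables to match on (20 in the real data, 2 in this toy example).
match_vars <- c("Age", "sex")

# Build the drop vector programmatically: every column that is neither
# a matching variable nor the treatment indicator.
todrop <- setdiff(names(bigdata), c(match_vars, "open"))
print(todrop)

# Assumption: this vector can be passed to drop just like a hand-written one.
# cem_big <- cem(treatment = "open", data = bigdata, drop = todrop)
```

The cem() call is left commented out so the snippet runs without the package installed; only setdiff() and names() from base R are actually executed.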

Thanks a lot in advance.

-Ashkan