dataframe - Testing for 1-level factors in R
I have a large dataset that I'm breaking down into smaller data frames based on one of the factors: state. Unfortunately, some states have little data (Alaska, for example). When I run a basic model on the smaller data frames, I have problems with one of the factors (a gender variable that's 'm' or 'f') when it ends up with only one level.
I'm using a loop to subset each state's data frame. I was planning on building an if statement that runs the model only if the subset doesn't have a 1-level factor, but I don't know how to build the if.
    states_list <- c("ak", ... "wy") # shortened for brevity
    resultslist <- list()
    j <- 1
    for (i in states_list) {
      temp_data <- raw[raw$state == i, ]
      fac <- min(factor(temp_data) # <- the part I don't have right
      if (fac > 1) {
        model <- lm(y_var ~ gender, data = temp_data)
        resultslist[[j]] <- summary(model)
      } else {
        print(i)
        print("doesn't have enough data points")
      }
      j = j + 1
    }

Thanks -W
You don't need to use loops, and I'd recommend using the broom package to save the model output as a data frame, so you can access any value you need.
    library(dplyr)
    library(broom)

    # example dataframe
    dt = data.frame(state = c(rep("aa", 20), rep("bb", 15)),
                    gender = c(rep("m", 10), rep("f", 10), rep("m", 15)),
                    value = rnorm(35, 100, 5),
                    stringsAsFactors = FALSE)

    dt %>%
      group_by(state) %>%                                # for each state
      mutate(numuniquegenders = n_distinct(gender)) %>%  # count how many unique values of gender you have (and add to each row)
      filter(numuniquegenders == 2) %>%                  # keep rows that belong to a state with both m and f
      do(tidy(lm(value ~ gender, data = .))) %>%         # run the model and save the output as a dataframe
      ungroup                                            # forget the grouping

    # # A tibble: 2 x 6
    #   state        term   estimate std.error  statistic      p.value
    #   <chr>       <chr>      <dbl>     <dbl>      <dbl>        <dbl>
    # 1    aa (Intercept) 99.9643476  1.092526 91.4983355 1.787784e-25
    # 2    aa     genderm  0.6236803  1.545066  0.4036594 6.912182e-01

So, in the end you'll have a data frame with 2 rows for each state that has both genders: 1 for the intercept and 1 for gender. States with only one gender level (bb in this example) are filtered out before the model is run.
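If you'd rather keep the loop from the question, the missing if condition can be a count of the distinct gender values in each subset. This is only a minimal sketch in base R: the `raw` data frame below is a made-up stand-in for the real data, and the results are stored by state name instead of by the counter `j`, which is my own choice and not something from the question.

```r
# Toy stand-in for the asker's `raw` data frame (hypothetical values).
set.seed(1)
raw <- data.frame(
  state  = c(rep("ak", 5), rep("wy", 10)),
  gender = c(rep("m", 5), rep(c("m", "f"), each = 5)),
  y_var  = rnorm(15, 100, 5)
)

states_list <- c("ak", "wy")
resultslist <- list()

for (i in states_list) {
  temp_data <- raw[raw$state == i, ]
  # Count the distinct gender values actually present in this subset.
  # unique() looks at the values, not the declared factor levels,
  # so unused levels left over from the full data are not counted.
  n_gender <- length(unique(temp_data$gender))
  if (n_gender > 1) {
    model <- lm(y_var ~ gender, data = temp_data)
    resultslist[[i]] <- summary(model)  # stored under the state name
  } else {
    message(i, " has only one gender level; skipping")
  }
}

names(resultslist)  # states for which a model was fitted
```

Note that unique() treats NA as a value of its own, so if gender can be missing you may want to drop the NAs before counting.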