This might do most of what you want. I did not understand the rbind
part of your question:
poo <- read.table(text = '
TRIAL_INDEX RIGHT_PUPIL_SIZE
1 10
1 8
1 6
1 4
1 NA
2 1
2 2
2 NA
2 4
2 5
', header = TRUE, stringsAsFactors = FALSE, na.strings = "NA")
my.summary <- as.data.frame(do.call("rbind", tapply(poo$RIGHT_PUPIL_SIZE, poo$TRIAL_INDEX,
function(x) c(index.sd = sd(x, na.rm = TRUE), index.mean = mean(x, na.rm = TRUE)))))
my.summary$TRIAL_INDEX <- rownames(my.summary)
poo <- merge(poo, my.summary, by = 'TRIAL_INDEX')
poo$RIGHT_PUPIL_SIZE <- ifelse((poo$RIGHT_PUPIL_SIZE > (poo$index.mean + 3 * poo$index.sd)) |
(poo$RIGHT_PUPIL_SIZE < (poo$index.mean - 3 * poo$index.sd)) |
is.na(poo$RIGHT_PUPIL_SIZE), NA, poo$RIGHT_PUPIL_SIZE)
poo
#    TRIAL_INDEX RIGHT_PUPIL_SIZE index.sd index.mean
# 1            1               10 2.581989          7
# 2            1                8 2.581989          7
# 3            1                6 2.581989          7
# 4            1                4 2.581989          7
# 5            1               NA 2.581989          7
# 6            2                1 1.825742          3
# 7            2                2 1.825742          3
# 8            2               NA 1.825742          3
# 9            2                4 1.825742          3
# 10           2                5 1.825742          3
Here is a solution that uses aggregate (starting again from the original poo data frame, before
any summary columns were merged in):
my.summary <- with(poo, aggregate(RIGHT_PUPIL_SIZE, by = list(TRIAL_INDEX),
FUN = function(x) { c(index.sd = sd(x, na.rm = TRUE),
index.mean = mean(x, na.rm = TRUE)) }))
my.summary <- do.call(data.frame, my.summary)
colnames(my.summary) <- c('TRIAL_INDEX', 'index.sd', 'index.mean')
poo <- merge(poo, my.summary, by = 'TRIAL_INDEX')
poo$RIGHT_PUPIL_SIZE <- ifelse((poo$RIGHT_PUPIL_SIZE > (poo$index.mean + 3 * poo$index.sd)) |
(poo$RIGHT_PUPIL_SIZE < (poo$index.mean - 3 * poo$index.sd)) |
is.na(poo$RIGHT_PUPIL_SIZE), NA, poo$RIGHT_PUPIL_SIZE)
Here is a solution that uses ave (again starting from the original poo data frame):
index.mean <- ave(poo$RIGHT_PUPIL_SIZE, poo$TRIAL_INDEX, FUN = function(x) mean(x, na.rm = TRUE))
index.sd <- ave(poo$RIGHT_PUPIL_SIZE, poo$TRIAL_INDEX, FUN = function(x) sd(x, na.rm = TRUE))
poo <- data.frame(poo, index.mean, index.sd)
poo$RIGHT_PUPIL_SIZE <- ifelse((poo$RIGHT_PUPIL_SIZE > (poo$index.mean + 3 * poo$index.sd)) |
(poo$RIGHT_PUPIL_SIZE < (poo$index.mean - 3 * poo$index.sd)) |
is.na(poo$RIGHT_PUPIL_SIZE), NA, poo$RIGHT_PUPIL_SIZE)
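The masking step is the same in every solution here, so it could be wrapped in a small helper built on ave. This is only a sketch: the function name mask_outliers and its arguments group and k are hypothetical and not part of the original answer, and it assumes a numeric vector plus a grouping vector of the same length.

```r
# Hypothetical helper (not from the original answer): set values that lie more
# than k group standard deviations from their group mean to NA.
mask_outliers <- function(x, group, k = 3) {
  grp.mean <- ave(x, group, FUN = function(v) mean(v, na.rm = TRUE))
  grp.sd   <- ave(x, group, FUN = function(v) sd(v, na.rm = TRUE))
  # abs(...) > k * sd covers both the upper and the lower bound at once
  is.outlier <- abs(x - grp.mean) > k * grp.sd
  # which() drops NA comparisons, so existing NA values simply stay NA
  x[which(is.outlier)] <- NA
  x
}

# Same values as the sample data used in this answer
size  <- c(10, 8, 6, 4, NA, 1, 2, NA, 4, 5)
trial <- c(1, 1, 1, 1, 1, 2, 2, 2, 2, 2)

masked.3sd <- mask_outliers(size, trial)         # default 3-SD rule
masked.1sd <- mask_outliers(size, trial, k = 1)  # stricter rule, for illustration
```

With the sample data no value lies more than 3 standard deviations from its trial mean, so the default call returns the input unchanged; the k = 1 call is there only to show values actually being masked.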
Here is a solution using dplyr (again starting from the original poo data frame) that differs
slightly from Dave2e's dplyr solution. His is probably better, since I had never used dplyr
before posting this answer.
library(dplyr)
my.summary <- poo %>%
group_by(TRIAL_INDEX) %>%
summarise(index.mean = mean(RIGHT_PUPIL_SIZE, na.rm = TRUE),
index.sd = sd(RIGHT_PUPIL_SIZE, na.rm = TRUE))
my.summary
poo <- merge(poo, as.data.frame(my.summary), by = 'TRIAL_INDEX')
poo$RIGHT_PUPIL_SIZE <- ifelse((poo$RIGHT_PUPIL_SIZE > (poo$index.mean + 3 * poo$index.sd)) |
(poo$RIGHT_PUPIL_SIZE < (poo$index.mean - 3 * poo$index.sd)) |
is.na(poo$RIGHT_PUPIL_SIZE), NA, poo$RIGHT_PUPIL_SIZE)
poo
Here is a solution using data.table. There are probably better data.table solutions; I think I
had used data.table only once before posting this answer.
poo <- read.table(text = '
TRIAL_INDEX RIGHT_PUPIL_SIZE
1 10
1 8
1 6
1 4
1 NA
2 1
2 2
2 NA
2 4
2 5
', header = TRUE, stringsAsFactors = FALSE, na.strings = "NA")
library(data.table)
my.summary <- data.frame(setDT(poo)[, .(index.mean = mean(RIGHT_PUPIL_SIZE, na.rm = TRUE),
index.sd = sd(RIGHT_PUPIL_SIZE, na.rm = TRUE)),
.(TRIAL_INDEX)])
poo <- merge(poo, my.summary, by = 'TRIAL_INDEX')
poo$RIGHT_PUPIL_SIZE <- ifelse((poo$RIGHT_PUPIL_SIZE > (poo$index.mean + 3 * poo$index.sd)) |
(poo$RIGHT_PUPIL_SIZE < (poo$index.mean - 3 * poo$index.sd)) |
is.na(poo$RIGHT_PUPIL_SIZE), NA, poo$RIGHT_PUPIL_SIZE)
poo
Can you give us a sample of your 'pal'? – rawr