2016-11-07 2 views
0

Ich habe eine monatliche Zeitreihe, die in eine Matrix umgewandelt werden sollte. (6 Jahre, aber die letzten 4 Monate fehlen!)R: Transformieren unvollständiger Zeitreihen in Matrix

 Jan  Feb  Mar  Apr  May  Jun  Jul  Aug  Sep  Oct  Nov  Dec 
1 62.17416 66.34748 69.58154 64.92033 70.51246 70.83153 74.19475 72.29366 77.97062 71.37903 67.08182 69.10599 
2 59.41197 61.44036 68.97711 72.43558 69.92715 74.20776 79.19150 71.48650 77.56661 67.44390 67.05218 66.43690 
3 58.79278 63.26696 70.97837 71.79423 72.04037 71.61184 72.64526 70.38479 72.43492 64.42046 66.87742 66.31345 
4 58.06906 60.35554 70.91920 65.76937 68.85464 68.77921 70.09373 66.03685 71.38322 66.42558 67.62292 70.25223 
5 60.59774 64.66411 73.27853 73.11031 73.57735 73.77766 78.45258 74.64698 79.07911 75.84108 75.82478 79.42484 
6 69.30907 70.80211 78.38037 77.63537 79.67396 80.15397 80.45571 76.67389 

Es gibt einige Werte im letzten Jahr der Zeitreihe fehlt. Bei der Umwandlung in eine Matrix ersetzt die Funktion die fehlenden Werte durch die Verwendung der ersten aus der Zeitreihe.

> matrix(xx,nrow=6,ncol=12,byrow = TRUE) 
     [,1]  [,2]  [,3]  [,4]  [,5]  [,6]  [,7]  [,8]  [,9] [,10] [,11] [,12] 
[1,] 62.17416 66.34748 69.58154 64.92033 70.51246 70.83153 74.19475 72.29366 77.97062 71.37903 67.08182 69.10599 
[2,] 59.41197 61.44036 68.97711 72.43558 69.92715 74.20776 79.19150 71.48650 77.56661 67.44390 67.05218 66.43690 
[3,] 58.79278 63.26696 70.97837 71.79423 72.04037 71.61184 72.64526 70.38479 72.43492 64.42046 66.87742 66.31345 
[4,] 58.06906 60.35554 70.91920 65.76937 68.85464 68.77921 70.09373 66.03685 71.38322 66.42558 67.62292 70.25223 
[5,] 60.59774 64.66411 73.27853 73.11031 73.57735 73.77766 78.45258 74.64698 79.07911 75.84108 75.82478 79.42484 
[6,] 69.30907 70.80211 78.38037 77.63537 79.67396 80.15397 80.45571 76.67389 62.17416 66.34748 69.58154 64.92033 

Wie kann man dies vermeiden oder die letzten Werte durch "NA" ersetzen?

+1

'Matrix (c (xx, rep (NA, 4)) , nrow = 6, ncol = 12, byrow = WAHR) '? – nrussell

+0

@nrussell Yey (y) Danke! – NMe

Antwort

1

Angenommen xx ist die in der Anmerkung am Ende angegebene Zeitreihe.

1) Dann haben wir es zu einer Matrix, ohne zu wissen, wie viele fehlende Elemente umwandeln kann es wie folgt:

tt <- time(xx) 
tapply(xx, list(floor(tt), cycle(tt)), c) 

geben diese 6 x 12 Matrix:

  1  2  3  4  5  6  7  8 
2011 62.17416 66.34748 69.58154 64.92033 70.51246 70.83153 74.19475 72.29366 
2012 59.41197 61.44036 68.97711 72.43558 69.92715 74.20776 79.19150 71.48650 
2013 58.79278 63.26696 70.97837 71.79423 72.04037 71.61184 72.64526 70.38479 
2014 58.06906 60.35554 70.91920 65.76937 68.85464 68.77921 70.09373 66.03685 
2015 60.59774 64.66411 73.27853 73.11031 73.57735 73.77766 78.45258 74.64698 
2016 69.30907 70.80211 78.38037 77.63537 79.67396 80.15397 80.45571 76.67389 
      9  10  11  12 
2011 77.97062 71.37903 67.08182 69.10599 
2012 77.56661 67.44390 67.05218 66.43690 
2013 72.43492 64.42046 66.87742 66.31345 
2014 71.38322 66.42558 67.62292 70.25223 
2015 79.07911 75.84108 75.82478 79.42484 
2016  NA  NA  NA  NA 

2) Eine andere Möglichkeit besteht darin, die Länge auf das nächsthöchste Vielfache von 12 zu zwingen, so dass sie eine Matrix exakt ausfüllt und neu formt. Dies funktioniert nur, wenn die hinzuzufügenden NAs am Ende sind, wie es in der Frage der Fall ist.

m <- xx # copy xx in case we still need it later 
freq <- frequency(xx) # 12 
length(m) <- freq * ceiling(length(m)/freq) # pads with NAs to achieve length 
m <- matrix(m,, freq, byrow = TRUE) # reshape m into a matrix 

geben:

> m 
     [,1]  [,2]  [,3]  [,4]  [,5]  [,6]  [,7]  [,8] 
[1,] 62.17416 66.34748 69.58154 64.92033 70.51246 70.83153 74.19475 72.29366 
[2,] 59.41197 61.44036 68.97711 72.43558 69.92715 74.20776 79.19150 71.48650 
[3,] 58.79278 63.26696 70.97837 71.79423 72.04037 71.61184 72.64526 70.38479 
[4,] 58.06906 60.35554 70.91920 65.76937 68.85464 68.77921 70.09373 66.03685 
[5,] 60.59774 64.66411 73.27853 73.11031 73.57735 73.77766 78.45258 74.64698 
[6,] 69.30907 70.80211 78.38037 77.63537 79.67396 80.15397 80.45571 76.67389 
     [,9] [,10] [,11] [,12] 
[1,] 77.97062 71.37903 67.08182 69.10599 
[2,] 77.56661 67.44390 67.05218 66.43690 
[3,] 72.43492 64.42046 66.87742 66.31345 
[4,] 71.38322 66.42558 67.62292 70.25223 
[5,] 79.07911 75.84108 75.82478 79.42484 
[6,]  NA  NA  NA  NA 

Anmerkung: Die Eingabezeitreihe xx in reproduzierbarer Form ist:

xx <- ts(c(62.17416, 66.34748, 69.58154, 64.92033, 70.51246, 
70.83153, 74.19475, 72.29366, 77.97062, 71.37903, 67.08182, 69.10599, 
59.41197, 61.44036, 68.97711, 72.43558, 69.92715, 74.20776, 79.1915, 
71.4865, 77.56661, 67.4439, 67.05218, 66.4369, 58.79278, 63.26696, 
70.97837, 71.79423, 72.04037, 71.61184, 72.64526, 70.38479, 72.43492, 
64.42046, 66.87742, 66.31345, 58.06906, 60.35554, 70.9192, 65.76937, 
68.85464, 68.77921, 70.09373, 66.03685, 71.38322, 66.42558, 67.62292, 
70.25223, 60.59774, 64.66411, 73.27853, 73.11031, 73.57735, 73.77766, 
78.45258, 74.64698, 79.07911, 75.84108, 75.82478, 79.42484, 69.30907, 
70.80211, 78.38037, 77.63537, 79.67396, 80.15397, 80.45571, 76.67389 
), start = c(2011, 1), freq = 12)